
Wednesday, April 19, 2023

Demystifying LLMs with Amazon distinguished scientists

Good piece.

Demystifying LLMs with Amazon distinguished scientists

April 18, 2023 • 2419 words

Werner, Sudipta, and Dan behind the scenes

Last week, I had a chance to chat with Swami Sivasubramanian, VP of database, analytics and machine learning services at AWS. He caught me up on the broad landscape of generative AI, what we’re doing at Amazon to make tools more accessible, and how custom silicon can reduce costs and increase efficiency when training and running large models. If you haven’t had a chance, I encourage you to watch that conversation.

Swami mentioned transformers, and I wanted to learn more about how these neural network architectures have led to the rise of large language models (LLMs) that contain hundreds of billions of parameters. To put this into perspective, since 2019, LLMs have grown more than 1000x in size. I was curious what impact this has had, not only on model architectures and their ability to perform more generative tasks, but also on compute and energy consumption, where we see limitations, and how we can turn those limitations into opportunities.

Diagram of transformer architecture

Transformers pre-process text inputs as embeddings. These embeddings are processed by an encoder that captures contextual information from the input, which the decoder then uses to generate the output text.
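To make that flow a little more concrete, here is a minimal sketch of the embed-encode-decode pipeline using PyTorch’s built-in nn.Transformer. The vocabulary size, layer counts, and random token ids below are illustrative placeholders, not parameters of any actual production model.

```python
import torch
import torch.nn as nn

# Toy vocabulary and embedding dimension (illustrative values only)
vocab_size, d_model = 1000, 64

# Token embeddings turn integer token ids into dense vectors
embed = nn.Embedding(vocab_size, d_model)

# A small encoder-decoder transformer (2 layers each, 4 attention heads)
transformer = nn.Transformer(d_model=d_model, nhead=4,
                             num_encoder_layers=2, num_decoder_layers=2,
                             batch_first=True)

# Fake "input text" and "output so far" as token ids
src_ids = torch.randint(0, vocab_size, (1, 10))   # source sequence, length 10
tgt_ids = torch.randint(0, vocab_size, (1, 4))    # target sequence, length 4

# The encoder consumes the embedded input; the decoder attends to the
# encoder's contextual representations while producing output states
out = transformer(embed(src_ids), embed(tgt_ids))

# Project decoder states back to vocabulary logits for next-token prediction
logits = nn.Linear(d_model, vocab_size)(out)
print(logits.shape)  # torch.Size([1, 4, 1000])
```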

Luckily, here at Amazon, we have no shortage of brilliant people. I sat with two of our distinguished scientists, Sudipta Sengupta and Dan Roth, both of whom are deeply knowledgeable on machine learning technologies. During our conversation they helped to demystify everything from word representations as dense vectors to specialized computation on custom silicon. It would be an understatement to say I learned a lot during our chat — honestly, they made my head spin a bit.

There is a lot of excitement around the near-infinite possibilities of a generic text-in/text-out interface that produces responses resembling human knowledge. And as we move towards multi-modal models that use additional inputs, such as vision, it wouldn’t be far-fetched to assume that predictions will become more accurate over time. However, as Sudipta and Dan emphasized during our chat, it’s important to acknowledge that there are still things that LLMs and foundation models don’t do well (at least not yet), such as math and spatial reasoning. Rather than viewing these as shortcomings, we can treat them as opportunities to augment these models with plugins and APIs. For example, a model may not be able to solve for X on its own, but it can write an expression that a calculator can execute, and then synthesize the answer into a response. Now, imagine the possibilities with the full catalog of AWS services only a conversation away.
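As an illustration of that calculator example, here is a minimal sketch of the tool-use pattern in Python. The llm() callable is a hypothetical stand-in for whatever text-generation API you use; only the arithmetic evaluation is real, standard-library code.

```python
# A sketch of the "model writes an expression, a calculator executes it" pattern.
import ast
import operator as op

OPS = {ast.Add: op.add, ast.Sub: op.sub, ast.Mult: op.mul, ast.Div: op.truediv}

def safe_eval(expr: str) -> float:
    """Evaluate a simple arithmetic expression without using eval()."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant):      # literal number
            return node.value
        if isinstance(node, ast.BinOp):         # e.g. 3 * (7 + 2)
            return OPS[type(node.op)](walk(node.left), walk(node.right))
        raise ValueError("unsupported expression")
    return walk(ast.parse(expr, mode="eval"))

def answer_with_calculator(question: str, llm) -> str:
    # 1. Ask the model to translate the question into an arithmetic expression
    expr = llm(f"Write only the arithmetic expression for: {question}")
    # 2. The "plugin" (a calculator) does the math the model is unreliable at
    result = safe_eval(expr)
    # 3. The model synthesizes a natural-language answer around the exact result
    return llm(f"The result of {expr} is {result}. Answer the question: {question}")
```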

Services and tools, such as Amazon Bedrock, Amazon Titan, and Amazon CodeWhisperer, have the potential to empower a whole new cohort of innovators, researchers, scientists, and developers. I’m very excited to see how they will use these technologies to invent the future and solve hard problems.

The entire transcript of my conversation with Sudipta and Dan is available below.

Now, go build!

