From hype to impact: re:Invent 2023 unveils generative AI

Keywords: [Amazon Web Services re:Invent 2023, Amazon Bedrock, Generative AI Architecture, Foundation Models, Retrieval-Augmented Generation, Responsible AI, Cloud Infrastructure]

Number of words in this article: 3100, reading time: 16 minutes

If the video cannot be played here, please go to bilibili to watch it: From Hype to Impact: Building a Generative AI Architecture.

Introduction

Generative AI represents a paradigm shift in how companies operate today. It enables developers to reimagine customer experiences and applications while transforming nearly every industry. Enterprises are rapidly innovating to create the right architecture for scaling generative AI safely, affordably, and responsibly to deliver business value. Learn how leaders are modernizing their data foundation, selecting industry-leading foundation models, and deploying on purpose-built accelerators to realize the full potential of generative AI.

Highlights of speech

The following is the editor's digest of the talk, about 2,800 words and roughly 14 minutes of reading. To learn more, watch the full video or read the original transcript linked below.

Francesca Vasquez, Vice President of Professional Services at Amazon Cloud Technology, warmly welcomed attendees of re:Invent 2023. She introduced the topic of generative AI and highlighted the remarkable capabilities the technology has demonstrated across industries, such as creating realistic images and videos and writing stories, poetry, and code. Vasquez emphasized that generative AI is a vast opportunity that will fundamentally change how businesses, consumers, and everyone else operate.

While acknowledging the enormous hype surrounding generative AI, Vasquez outlined her goal for the session: to turn that hype into real, measurable impact by showing how companies can use generative AI to build production-ready architectures that enable meaningful innovation. She then explained why we have now reached a tipping point for generative AI, thanks to three key factors. First, the vast amounts of data available today give generative AI models the training data they need to learn effectively. Second, the highly scalable computing power of the cloud makes it practical to train complex generative AI models, which was previously out of reach. Finally, significant advances in machine learning methods and technologies, especially deep learning architectures like the transformer, have delivered the technical breakthroughs needed to build powerful generative AI models.

Vasquez explained that these machine learning advances are what make generative AI possible. She contrasted this with traditional machine learning, where models require months of expensive manual data preparation, annotation, and training to produce a system that can perform only a single, specific task.

We can now leverage vast amounts of data to capture and represent knowledge in more advanced ways. The large neural network models that power leading generative AI applications are called foundation models. They are built on the transformer architecture, which allows them to be pre-trained on large amounts of unlabeled data. This makes a model immediately usable for a variety of general-purpose tasks, and it can also be adapted to a specific domain or industry with relatively little additional data.

Vasquez detailed how to interact with these powerful foundation models through simple prompts: clear, descriptive instructions that guide the model to produce the precise output we expect. She pointed out that prompts have become a new user-interface paradigm for generative AI, letting us specify what we want the model to create, the desired output format, the contextual data to use, and so on.

Distilling key insights from the successes of early adopters, Vasquez summarized five core design principles behind generative AI architectures that deliver real impact:

First, implementing generative AI is far simpler than expected, allowing developers to get started quickly without specialized machine learning expertise.

Second, customers have and need choice in models, frameworks, and services to drive greater innovation, deliver superior experiences, and have broader business impact.

Third, data remains the ultimate competitive advantage, so developing a strong data cloud strategy is critical to maximizing the value of generative AI.

Fourth, as with any advanced technology, security should be considered a primary concern and priority from the outset.

Finally, scalable, reliable generative AI is impossible without cloud computing. Cloud computing provides the core infrastructure and services needed to build, deploy, and effectively run generative AI applications.

To demonstrate how Amazon Cloud Technologies can simplify the use of generative AI, Vasquez demonstrated Amazon Bedrock. This fully managed service provides a choice of high-performance base models through a single API call. Developers don’t need to manage infrastructure—just point to the Bedrock API and instantly take advantage of the power of state-of-the-art generative models.
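To make the "single API call" point concrete, here is a minimal sketch (not from the talk) of calling Bedrock's InvokeModel API with boto3. It assumes AWS credentials are configured and uses the Anthropic Claude v2 model ID and request format that Bedrock offered at the time of re:Invent 2023; the helper names are my own.

```python
import json

def build_claude_body(prompt: str, max_tokens: int = 300) -> str:
    """Build an InvokeModel request body in the Claude v2 prompt format."""
    return json.dumps({
        "prompt": f"\n\nHuman: {prompt}\n\nAssistant:",
        "max_tokens_to_sample": max_tokens,
    })

def invoke_bedrock(prompt: str, model_id: str = "anthropic.claude-v2") -> str:
    """One API call, no infrastructure to manage. Requires boto3 and
    AWS credentials with Bedrock model access enabled."""
    import boto3  # imported lazily so the pure helper above works without it
    client = boto3.client("bedrock-runtime")
    response = client.invoke_model(modelId=model_id, body=build_claude_body(prompt))
    return json.loads(response["body"].read())["completion"]
```

Swapping models is then just a matter of changing `model_id` and the body format for that provider.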

Bedrock also makes it easy to customize your own private model: just point it at some labeled examples in Amazon S3, and the service takes care of training a model tuned to your specific domain or use case. Vasquez explained how Bedrock supports retrieval-augmented generation, enabling models to draw on your organization's data and documents to produce highly customized, relevant output.
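A sketch of what "point to labeled examples in S3" looks like in practice, assuming the prompt/completion JSONL format Bedrock model customization uses; the job name, model name, bucket URIs, and hyperparameters below are placeholders, not values from the talk.

```python
import json

def to_training_record(prompt: str, completion: str) -> str:
    """One line of the JSONL training file: a labeled prompt/completion pair."""
    return json.dumps({"prompt": prompt, "completion": completion})

def start_customization_job(training_s3_uri: str, role_arn: str):
    """Kick off a Bedrock fine-tuning job against labeled examples in S3."""
    import boto3  # requires AWS credentials and an IAM role Bedrock can assume
    bedrock = boto3.client("bedrock")
    return bedrock.create_model_customization_job(
        jobName="my-domain-tuning",            # hypothetical
        customModelName="my-domain-model",     # hypothetical
        roleArn=role_arn,
        baseModelIdentifier="amazon.titan-text-express-v1",
        trainingDataConfig={"s3Uri": training_s3_uri},
        outputDataConfig={"s3Uri": training_s3_uri.rsplit("/", 1)[0] + "/output/"},
        hyperParameters={"epochCount": "2"},
    )
```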

Building on Bedrock, Vasquez highlighted other easy entry points for building with generative AI, such as Amazon CodeWhisperer and PartyRock. CodeWhisperer provides developers with AI-generated code suggestions, while PartyRock offers a fun platform for quickly creating generative AI applications.

Regarding the key need for customer choice, Vasquez detailed how Amazon Web Services supports access to a growing number of foundation model options, including models from partners such as Anthropic, Cohere, and Stability AI as well as Amazon's own Amazon Titan family. Bedrock provides the flexibility to choose the model size that best suits your application's latency and cost requirements.

Vasquez noted that text vector embeddings have become the primary way to prepare data for retrieval-augmented generation, which is why Amazon Web Services now offers embedding support directly within services such as Amazon OpenSearch Service and Amazon Aurora PostgreSQL.
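The embedding-based retrieval those services perform can be sketched in a few lines. The `embed` helper below assumes the Amazon Titan embeddings model ID and request shape available on Bedrock at the time; the similarity ranking itself is plain math, which is what a vector index in OpenSearch or Aurora handles at scale for you.

```python
import json
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, corpus, k=3):
    """Rank (text, vector) pairs by similarity to the query embedding."""
    ranked = sorted(corpus, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

def embed(text: str) -> list:
    """Get an embedding from the Amazon Titan embeddings model on Bedrock."""
    import boto3  # requires AWS credentials with Bedrock model access
    client = boto3.client("bedrock-runtime")
    response = client.invoke_model(
        modelId="amazon.titan-embed-text-v1",
        body=json.dumps({"inputText": text}),
    )
    return json.loads(response["body"].read())["embedding"]
```

The top-ranked passages are then stuffed into the prompt as context, which is the "retrieval" half of retrieval-augmented generation.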

Although foundation models are very powerful, Vasquez explained that they still require manual programming to complete complex multi-step tasks, such as booking a flight, because they cannot fulfill a request by performing a concrete action. This is where AI agents come in: they extend a foundation model's reasoning capabilities by calling APIs and integrating with backend systems. Fully managed agents, like those in Amazon Bedrock, make it easier to create and orchestrate multi-step plans that accomplish a user's goal.
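The agent pattern itself is simple to sketch: the model's reasoning selects an action, and plain code dispatches it to a backend API. This is a hand-rolled illustration, not the Agents for Amazon Bedrock API; the tool registry and the `book_flight` stub are hypothetical.

```python
# Minimal sketch of the agent pattern: the model emits a structured action,
# and a dispatcher executes it against a registered tool.

TOOLS = {}

def tool(name):
    """Register a function so the agent loop can dispatch to it by name."""
    def register(fn):
        TOOLS[name] = fn
        return fn
    return register

@tool("book_flight")
def book_flight(origin: str, destination: str) -> str:
    # In a real agent this would call a booking backend's API.
    return f"booked {origin} -> {destination}"

def run_action(action: dict) -> str:
    """Execute one step of a model-produced plan,
    e.g. {"tool": "book_flight", "args": {...}}."""
    fn = TOOLS[action["tool"]]
    return fn(**action["args"])
```

A managed agent adds the parts this sketch omits: planning across steps, retries, and secure credential handling for each backend call.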

Vasquez emphasized that promoting responsible AI development is a top priority for Amazon Web Services. They take a human-centered approach and integrate responsible practices throughout the machine learning lifecycle. For example, Bedrock provides a Guardrails capability for enforcing safeguards consistent with organizational policies, giving further control over model behavior. More broadly, Bedrock offers many features to support security, privacy, and governance requirements: the service is HIPAA eligible and GDPR compliant, user content is encrypted in transit and at rest, and custom encryption keys can be used for greater control. Vasquez reiterated that your content is not used to improve the base models or shared externally.

Vasquez then introduced Ori Goshen, co-founder and co-CEO of AI21 Labs, sharing how they are pioneering state-of-the-art natural language models and AI systems to transform various industries.

Goshen began by explaining that AI21 Labs' mission is to build reliable and powerful AI systems that empower professionals and businesses. Their models are embedded in thousands of applications spanning industries such as retail, financial services, healthcare, and education. Looking two years ahead, Goshen predicted the conversation will shift from large language models to AI systems. He detailed how AI21 is leading this shift with its Jurassic-2 language models, which come in two sizes: Jurassic-2 Ultra for complex tasks and Jurassic-2 Mid for the best balance of quality and cost.

These models excel in text generation, question answering, summarization, and classification. For example, a leading sports retailer uses Jurassic to generate customized product descriptions, giving them a unique tone, length, and purpose.

Goshen demonstrated how easy it is to access Jurassic through Amazon Bedrock. The user only needs to specify the model ID and provide a prompt. AI21 also provides a Python SDK for developers’ convenience.
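A sketch of that Bedrock call, assuming the `ai21.j2-ultra-v1` model ID and the Jurassic-2 request/response shape Bedrock used at the time; the helper names are my own.

```python
import json

J2_MODEL_ID = "ai21.j2-ultra-v1"  # Jurassic-2 Ultra on Bedrock

def build_j2_body(prompt: str, max_tokens: int = 200) -> str:
    """Request body in the Jurassic-2 format Bedrock expects."""
    return json.dumps({"prompt": prompt, "maxTokens": max_tokens})

def complete(prompt: str) -> str:
    """Invoke Jurassic-2 through Bedrock: just a model ID and a prompt.
    Requires boto3 and AWS credentials with Bedrock model access."""
    import boto3
    client = boto3.client("bedrock-runtime")
    response = client.invoke_model(modelId=J2_MODEL_ID, body=build_j2_body(prompt))
    payload = json.loads(response["body"].read())
    return payload["completions"][0]["data"]["text"]
```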

Based on common customer use cases, AI21 develops task-specific models tailored for tasks such as grounded question answering, summarization, rewriting, and more. Goshen explained how these models go beyond the base model to handle complexities such as retrieval-augmented generation, input validation, and output evaluation.

He emphasized the advantages of their Contextual Answers model for answering specific questions. The model validates responses to ensure accuracy and comprehensiveness, reducing common problems such as hallucinations, which is crucial in practice. It can accept natural-language questions and contextual data without requiring prompt engineering or fine-tuning.

According to Goshen, customers are excited about the high accuracy, lower cost, reduced latency, and ease of use of these task-specific models. He noted that they are ideal for production environments because they perform only their defined task and cannot be diverted or manipulated.

For example, Goshen recounted the case of a bank that transformed its customer support with a next-generation chatbot. The chatbot architecture combines the Jurassic model for categorization and rephrasing, the Contextual Answers model for informed responses, and integrations with other systems to provide superior customer service.

Finally, Goshen praised the close engineering collaboration between AI21 Labs and Amazon Web Services. Customers can access AI21’s generative capabilities while ensuring security, privacy and governance needs are met. He emphasized that by delivering large, consistent workloads, Amazon Web Services can guarantee throughput and scale. Goshen reiterated that the shift from experimentation to adoption will drive increased demand for reliability and production readiness, and that AI systems designed for specific tasks are best suited to meet this need.

Vasquez returned to the stage to discuss how to build modern, scalable data architectures to maximize the value of generative AI. She explained that the cloud has enabled companies to collect more data than ever before—terabytes or petabytes in many cases. To effectively leverage all this data, Vasquez emphasized that your strategy needs to provide the right tools for each use case, connect all data sources, and maintain end-to-end governance.

Amazon Web Services offers a broad set of data services designed to support the complete data lifecycle. Vasquez pointed out that when you invest in building a strong data foundation, the rewards when deploying generative AI capabilities are huge.

She emphasized that while foundation models are powerful, their knowledge of a specific organization and its customers is limited. Data is critical to align generative AI with real needs and realize its potential. Vasquez listed three main methods for integrating data with a foundation model: training a custom model from scratch; fine-tuning a foundation model on existing data; and using retrieval-augmented generation at runtime to supply personalized data. She again highlighted vector embeddings as the standard way to prepare text data for retrieval-augmented generation, with embedding functionality available directly within services such as Amazon OpenSearch Service and Amazon Aurora PostgreSQL.

However, Vasquez reminded the audience that even when foundation models reason and generate well, they cannot perform actions such as booking a flight. This is where AI agents come into play: they carry out tasks through integration with APIs and systems, and fully managed agents extend the foundation model to create and coordinate multi-step plans. Responsible AI remains a top priority, and Vasquez highlighted features like Guardrails that let organizational policies be enforced to control model behavior; Bedrock is designed to support robust security, privacy, and governance requirements out of the box.

Finally, Vasquez welcomed Moses Nascimento, Chief Technology Officer of Itaú Unibanco, to share how they are building a world-class data and AI infrastructure to drive generative AI innovation.

Nascimento first introduced the background of Itaú, which is Latin America's most valuable bank with more than 70 million customers, 100,000 employees, and a 100-year history of serving Brazil. He joined Itaú five years ago to drive its digital transformation and modernize its data and analytics platform. This includes migrating operations to Amazon Cloud Technologies and adopting cloud computing in the Brazilian financial sector as regulations allow. Itaú has entered into a strategic partnership with Amazon Cloud Technologies to grow its business and build next-generation digital infrastructure and products. In just four years, they have modernized more than 50% of their systems, representing 70% of their most competitive services, while reducing incidents by 98%.

Nascimento elaborated on how they are comprehensively modernizing their data architecture and management to enable agility and scalability and to drive decentralized innovation across business units. Their data mesh design provides a control plane for security, governance, and privacy, eliminating the need for ETL and enabling data to be accessed once and used in many places. For storage, services such as Amazon S3, Amazon EMR, and Amazon Redshift provide the foundation, while business teams analyze and explore data with out-of-the-box tools like Amazon Athena and Amazon QuickSight.

Based on this data, Itaú developed a scalable AI platform called Yara, which provides a complete machine learning lifecycle framework and tools, including feature libraries, risk analysis, deployment coordination, and observability. Yara leverages Amazon SageMaker and other Amazon Cloud technology building blocks to automate and accelerate AI innovation.

When generative AI emerged, Itaú followed the same principles to create a sandbox environment, including a configured control plane, a data preparation and integration layer, and a modular application layer. This allows rapid experimentation in a controlled, secure manner across business units. Nascimento shared an example of how their legal team uses AI to interpret more than 70,000 legal processes per month with over 99% accuracy. For investment clients, they built an AI model that explains events that could impact a portfolio and suggests mitigation strategies. He emphasized that their goal is to continuously improve the platform to provide a better everyday banking experience.

Vasquez emphasized that realizing the potential of generative AI requires building on proven, cloud-scale infrastructure. She detailed the global footprint of Amazon Web Services: 102 Availability Zones across 32 geographic regions, plus more than 450 points of presence worldwide. This scale provides the foundation to build, deploy, and run the most demanding applications.

Amazon Web Services has invested deeply in custom hardware to support machine learning over the past decade. It pioneered GPU instances in the cloud 12 years ago and today supports large-scale model training across clusters of up to 10,000 GPUs. New purpose-built chips such as Trainium2 push performance further and power the next generation of EC2 instances for training foundation models.

Vasquez then introduced Jens Schule, head of offboard architecture at BMW, to discuss the company's cloud architecture and use of generative AI.

Schule began by describing BMW's passion for delivering exceptional experiences, whether driving its vehicles or using its digital services, and emphasized the need for digital experiences to be tightly integrated with the vehicle and cloud backends. He shared some impressive statistics on the scale of BMW's connected-vehicle ecosystem: 20 million connected vehicles worldwide generating 12 billion requests and 110 TB of data per day, while maintaining 99.95% reliability. The upcoming model architecture will double these figures when it launches in two years' time.

Schule explained that the only way to optimize such a large, complex backend is through extensive automation. He detailed their flywheel approach: using tools like AWS Trusted Advisor and AWS Config to continuously measure cloud resources, understand optimization opportunities, and then implement fixes programmatically across all accounts. The process broke down, however, at the insight and implementation stages, so BMW built a generative AI assistant on Amazon Bedrock to analyze findings, explain problems, suggest solutions, and even implement fixes using Python and Terraform.

Schule demonstrated the assistant live on stage. Asked about low EC2 utilization, it analyzed the account, explained the problem, and produced Python code to resize the instance. It can even apply the fix autonomously and confirm via callback once complete.
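BMW's actual generated code was not shown in detail; a plausible sketch of such a remediation, using real EC2 APIs, might look like the following. The downsize mapping is illustrative only, and the instance must be stopped before its type can change.

```python
def smaller_type(current: str, downsize_map=None) -> str:
    """Pick a smaller instance type; this mapping is a made-up example."""
    downsize_map = downsize_map or {"m5.2xlarge": "m5.xlarge", "m5.xlarge": "m5.large"}
    return downsize_map.get(current, current)  # unchanged if no smaller size known

def resize_instance(instance_id: str, new_type: str) -> None:
    """Stop the instance, change its type, and restart it via the EC2 API.
    Requires boto3 and credentials with EC2 permissions."""
    import boto3
    ec2 = boto3.client("ec2")
    ec2.stop_instances(InstanceIds=[instance_id])
    ec2.get_waiter("instance_stopped").wait(InstanceIds=[instance_id])
    ec2.modify_instance_attribute(
        InstanceId=instance_id, InstanceType={"Value": new_type}
    )
    ec2.start_instances(InstanceIds=[instance_id])
```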

He gave an overview of the serverless architecture, highlighting how Bedrock ensures that no data leaves their account during training or inference. BMW's multi-agent design allows specialized models to be combined for each task. Integration with AWS documentation and access to BMW's internal data through retrieval-augmented generation enables the assistant to provide contextual, relevant responses.

Schule concluded that the assistant is helping BMW expand cloud governance, reduce manual governance overhead for developers, promote company-wide learning, and be easy to maintain and scale. Other ongoing projects include analyzing CloudWatch logs and directly inspecting resource configurations.

Finally, Schule reiterated that BMW has been an early and enthusiastic adopter of artificial intelligence across its value chain, and that this generative AI assistant is one example of turning that into real impact. He hinted at more innovation to come as BMW continues to use AI to improve its workflows and products.

Overall, Vasquez emphasized how Amazon Web Services provides a range of services to support the adoption of generative AI, including foundation models, infrastructure, training, and tools. After the inspiring customer stories, she expressed excitement about attendees' potential to innovate and transform using generative AI on AWS. Vasquez then thanked everyone for participating and concluded her re:Invent 2023 presentation, an insightful session on how generative AI can be used to achieve real business impact.

Here are some highlights from the speech:

The speaker enthusiastically welcomed the audience to re:Invent 2023.

Today, generative AI is transforming entire industries, with the ability to create stunning content in a variety of media.

The speaker explained in detail how prompts let users explicitly direct the underlying model to generate the desired output.

The key, they argue, is to build specialized AI systems, not just make more powerful models.

I am very proud to be part of Itaú's digital transformation journey, helping this century-old company, with more than 70 million customers and one of Latin America's most valuable brands, modernize its data and analytics platform.

Speakers also talked about how Amazon Fire TV and Alexa in Audi cars provide a better driving experience by letting customers control them by voice.

They look forward to seeing how customers and partners will leverage generative AI on Amazon Web Services.

Summary

Francesca Vasquez, Vice President at Amazon Web Services, emphasized in her re:Invent 2023 talk that generative AI has the potential to transform industries, and that Amazon Web Services is ready for it.

The talk traced the deep learning model architectures driving progress in generative AI. By using prompts to converse with foundation models, complex outputs can be generated from simple inputs.

Amazon Web Services provides services such as Amazon Bedrock and SageMaker JumpStart in this area. Bedrock simplifies access to foundation models, while JumpStart enables more advanced customization. Data is the source of differentiation, and retrieval-augmented generation (RAG) can ground a model's answers in the organization's own data.

Promoting responsible AI is also an important theme for Amazon Web Services. Bedrock includes Guardrails, a feature for enforcing policy-compliant safeguards.

The talk presented the AI21 Labs, Itaú, and BMW cases, demonstrating how Amazon Web Services' offerings and infrastructure are accelerating the adoption of generative AI, and reaffirmed that Amazon Web Services will fully support customers' success in this field.

Original speech

From hype to impact: Building a generative AI architecture - CSDN Blog

Want to know more exciting and complete content? Visit re:Invent official Chinese website now!

2023 Amazon Cloud Technology re:Invent Global Conference - Official Website

Click here to get the latest global product/service information from Amazon Cloud Technology with one click!

Click here to get the latest product/service information from Amazon Cloud Technology China with one click!

Register an Amazon Cloud Technology account now and start your cloud journey!

[Free] Amazon Cloud Technology "Free trial of more than 100 core cloud service products"

[Free] "Free trial of more than 40 core cloud service products" of Amazon Cloud Technology China

Who is Amazon Cloud Technology?

Amazon Cloud Technology (Amazon Web Services) is a pioneer and leader in global cloud computing. Since 2006 it has been known across the industry for continuous innovation, technology leadership, rich services, and broad adoption, and it can support almost any workload on the cloud. It currently provides more than 200 full-featured services covering compute, storage, networking, databases, data analytics, robotics, machine learning and artificial intelligence, the Internet of Things, mobile, security, hybrid cloud, virtual and augmented reality, media, and application development, deployment, and management. Its infrastructure spans 99 Availability Zones in 31 geographic regions, with 4 new regions and 12 more Availability Zones planned. Millions of customers around the world, from startups and small and medium-sized businesses to large enterprises and government agencies, trust Amazon Cloud Technology and use its services to strengthen their infrastructure, improve agility, reduce costs, accelerate innovation, enhance competitiveness, and achieve business growth and success.


Origin blog.csdn.net/goandstop25/article/details/134715486