Agents change the rules of the game: Amazon Cloud Technology's generative AI helps foundation models speed up workflows

Recently, Stability AI officially released its next-generation text-to-image model, Stable Diffusion XL 1.0. This 1.0 release is Stability AI's flagship image model and its most advanced open-source image model to date.

Among currently open image models, SDXL 1.0 has the largest parameter count. According to the official announcement, the release adopts a new architecture: the base model has 3.5 billion parameters, and the ensemble pipeline with the refiner model reaches 6.6 billion. And this powerful image-generation model can now be accessed with one click through Amazon Cloud Technology's Amazon Bedrock!

Foundation models, fully refreshed

Just last week, Amazon Cloud Technology announced a wave of new foundation models. Besides the SDXL 1.0 just mentioned, Amazon Bedrock has added support for Cohere's foundation models and for Anthropic's Claude 2, ChatGPT's strongest competitor.


Command, a large language model developed by Cohere, can be tuned with users' own instructions. It focuses on three core AI capabilities: text search, text classification, and text generation. In addition, Anthropic's Claude 2 now handles a context of 100,000 tokens, and compared with the previous version it is significantly better at mathematics, coding, and reasoning. Meanwhile, developers can also use Amazon SageMaker JumpStart, a machine learning hub, to deploy popular open-source models with one click, such as Meta's latest Llama 2 and the Falcon and Flan models hosted by Hugging Face, the world's largest open-source ML community.
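For context, invoking these models on Bedrock goes through a single runtime API whose request body follows each provider's own schema. The sketch below only builds such bodies; the field names and model IDs are assumptions based on Bedrock's documented conventions, and the actual call (commented out) would require AWS credentials and Bedrock access.

```python
import json

# Request bodies follow each provider's schema; the field names below are
# assumptions based on Bedrock's documented conventions at launch.
def claude2_body(prompt: str, max_tokens: int = 300) -> str:
    """Build an invoke_model body for Anthropic's Claude 2."""
    return json.dumps({
        "prompt": f"\n\nHuman: {prompt}\n\nAssistant:",
        "max_tokens_to_sample": max_tokens,
    })

def sdxl_body(prompt: str, steps: int = 30) -> str:
    """Build an invoke_model body for Stability AI's SDXL 1.0."""
    return json.dumps({
        "text_prompts": [{"text": prompt}],
        "steps": steps,
    })

if __name__ == "__main__":
    # Hypothetical call -- requires AWS credentials and Bedrock access:
    # import boto3
    # client = boto3.client("bedrock-runtime")
    # resp = client.invoke_model(modelId="anthropic.claude-v2",
    #                            body=claude2_body("Summarize RAG in one line."))
    print(claude2_body("Hello"))
```

The point of the per-provider schemas is that one `invoke_model` entry point can front very different models, from chat to image generation.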

 

Agents change the rules of the game

Although foundation models generalize well across many tasks, application scenarios keep expanding, and the model alone struggles to complete some complex tasks. The recent explosion of AutoGPT pointed academia and industry toward a new direction of exploration: agents built around a large language model.

At its simplest, an agent runs in a loop, generating its own instructions and actions on each iteration. As a result, agents do not need a human to steer the conversation, and they are also highly scalable.
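That loop can be sketched in a few lines. Everything here is illustrative: `plan` stands in for a real LLM call, and `TOOLS` for real APIs the agent could invoke.

```python
# Minimal sketch of an agent loop: on each iteration the model proposes an
# action, the runtime executes it, and the observation feeds the next turn.
# `plan` and the tools are stand-ins, not a real model or real APIs.

def plan(goal, history):
    """Stand-in for an LLM call: pick the next action toward the goal."""
    if not history:
        return ("search", goal)
    # Once an observation exists, wrap up with it as the answer.
    return ("finish", history[-1][1])

TOOLS = {"search": lambda q: f"top result for {q!r}"}

def run_agent(goal, max_steps=5):
    history = []
    for _ in range(max_steps):
        action, arg = plan(goal, history)
        if action == "finish":
            return arg
        # Execute the tool and record the observation for the next iteration.
        history.append((action, TOOLS[action](arg)))
    return None  # step budget exhausted

print(run_agent("order status"))   # → top result for 'order status'
```

The step budget (`max_steps`) is what keeps an autonomous loop from running forever, a standard safeguard in agent runtimes.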

Amazon Cloud Technology has explored this field as well, introducing the new Amazon Bedrock Agents. With the Agents capability in Amazon Bedrock, developers can easily create generative AI applications that complete complex tasks and provide up-to-date answers grounded in proprietary knowledge sources. What used to take hours of coding now takes only a few clicks: Agents automatically decompose tasks and create plans without any manual coding, so a generative AI application can be built in minutes.

So how do Amazon Bedrock Agents help foundation models speed up workflows? The process breaks down into four steps:

● Step 1: Define instructions and orchestration, decomposing a complex task into multiple steps

● Step 2: Retrieval-Augmented Generation (RAG): configure the foundation model to interact with company data

● Step 3: Complete the interaction, performing API calls to satisfy the user's request

● Step 4: Host securely in the cloud

Amazon Bedrock Agents can connect to company data through a simple API, convert it into a machine-readable format, generate accurate responses, and then automatically call APIs to fulfill the user's request.
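The retrieval step (Step 2 above) can be sketched with a toy retriever. Word overlap stands in for real embedding similarity, and the documents and names are all made up:

```python
# Toy sketch of the RAG step an agent performs: retrieve the most relevant
# company document for a query, then splice it into the model prompt.
# Scoring by word overlap is a stand-in for real embedding similarity.

def retrieve(query: str, docs: dict) -> str:
    """Return the doc id whose text shares the most words with the query."""
    q = set(query.lower().split())
    return max(docs, key=lambda d: len(q & set(docs[d].lower().split())))

def build_prompt(query: str, docs: dict) -> str:
    best = retrieve(query, docs)
    return f"Answer using only this context:\n{docs[best]}\n\nQuestion: {query}"

docs = {
    "returns": "you may return items within 30 days of purchase",
    "shipping": "standard shipping takes 5 business days",
}
prompt = build_prompt("how many days do I have to return an item", docs)
print(prompt)
```

Grounding the prompt in retrieved company data is what lets the model answer from proprietary knowledge instead of only its training data.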

 

Backed by the NVIDIA H100

The steadily improving performance of foundation models, however, comes with parameter counts in the hundreds of billions. This skyrocketing complexity dramatically increases model training and fine-tuning time; the latest LLMs take months to train. A similar trend is emerging in the HPC space: as accuracy improves, the datasets users collect have reached the exabyte level. To meet the demand for high-performance, scalable compute, Amazon Cloud Technology has launched the new Amazon Elastic Compute Cloud (EC2) P5 instance, equipped with NVIDIA's most powerful GPU, the H100.

Compared with the previous generation, Amazon EC2 P5 instances cut training time by up to 6x (from days to hours) and reduce training costs by up to 40%. Specifically, an Amazon EC2 P5 instance packs 8 NVIDIA H100 Tensor Core GPUs with 640 GB of high-bandwidth GPU memory, third-generation AMD EPYC processors, 2 TB of system memory, 30 TB of local NVMe storage, and up to 3,200 Gbps of aggregate network bandwidth.

This fully loaded configuration powers the most demanding, compute-intensive generative AI applications, including question answering, code generation, video and image generation, and speech recognition, and is ideal for training and running increasingly complex LLM and computer-vision models.

Based on the new Amazon EC2 P5 instances, users can explore previously inaccessible problems and iterate on solutions faster. In addition, to meet users' needs for large scale and low latency, Amazon Cloud Technology has also launched second-generation EC2 UltraClusters built on Amazon EC2 P5 instances. As the largest-scale ML infrastructure in the cloud, EC2 UltraClusters deliver up to 20 exaflops of aggregate compute with low latency across more than 20,000 NVIDIA H100 GPUs.
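As a back-of-the-envelope sanity check of the 20-exaflop figure: assuming roughly one petaflop of low-precision compute per H100 (an approximation; the exact per-GPU number depends on precision and sparsity), 20,000 GPUs aggregate to about 20 exaflops.

```python
# Sanity check of the "20 exaflops across 20,000 H100s" UltraCluster figure.
# Assumption: ~1 petaflop of low-precision compute per H100.
GPUS = 20_000
PFLOPS_PER_H100 = 1.0  # approximate, low-precision throughput

# 1 exaflop = 1,000 petaflops
aggregate_exaflops = GPUS * PFLOPS_PER_H100 / 1_000
print(aggregate_exaflops)  # → 20.0
```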

 

Insert an "external brain" into the model

The construction of agents makes one thing clear: applications built on large models need RAG to obtain real-time data. And that technique is the foundation of the important role vector databases play in AI applications.

At the Amazon Cloud Technology Summit, the Amazon OpenSearch Serverless vector engine was unveiled for the first time. With this tool, developers can easily use a vector database to quickly build search experiences on top of large models. Overall, the vector engine introduces simple, scalable, high-performance vector storage and search. Developers can quickly store and query billions of vector embeddings generated by various ML models (including those provided by Amazon Bedrock), with millisecond response times.

Generative AI is currently exploding, and enterprises across vertical industries are riding the wave, exploring how integrating advanced conversational generative AI applications can transform user experiences and interactions with digital platforms. This tool from Amazon Cloud Technology enhances ML-powered search and generative AI through vector embeddings.

Vector embeddings are trained on users' private data and can represent the semantic and contextual properties of information. The advantage is that a user's query can be processed promptly to find the closest vectors and combine them with other metadata, without relying on external data sources or extra application code to integrate the results.
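Finding the closest vector boils down to a nearest-neighbor search over embeddings. A toy version with made-up three-dimensional vectors (real embeddings have hundreds or thousands of dimensions, and a vector engine adds indexing so it never scans linearly):

```python
import math

# Toy nearest-neighbor search over embeddings, the core operation a vector
# engine performs at scale. The vectors and query here are made up.
def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def nearest(query, index):
    """Return the id of the stored vector most similar to the query."""
    return max(index, key=lambda k: cosine(query, index[k]))

index = {
    "doc-a": [0.9, 0.1, 0.0],
    "doc-b": [0.1, 0.8, 0.3],
    "doc-c": [0.0, 0.2, 0.9],
}
print(nearest([1.0, 0.0, 0.1], index))  # → doc-a
```

A managed vector engine performs exactly this kind of similarity search, but over billions of vectors with approximate-nearest-neighbor indexes to keep responses in the millisecond range.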

It is worth noting that the vector engine is built on Amazon OpenSearch Serverless, so there is no need to worry about sizing, tuning, or scaling the back-end infrastructure. All data is persisted in Amazon Simple Storage Service (Amazon S3). As vector counts grow from a few thousand during prototyping to hundreds of millions or more in production, the vector engine scales seamlessly, without re-indexing or reloading data.

Additionally, the vector engine provides independent compute for indexing and search workloads, so developers can seamlessly ingest, update, and delete vectors in real time while ensuring the user experience is not impacted by query performance. With vector engine support in Amazon OpenSearch Serverless, developers get a simple, scalable, high-performance solution for building ML-enhanced search experiences and generative AI applications without having to manage vector database infrastructure.

 

Global Generative AI Leader

With exploding data volumes, highly scalable compute, and advances in machine learning, generative AI is poised to change every industry, so more and more enterprises want to adopt the latest technology quickly and create value. But selecting the right model, customizing it securely with corporate data, and integrating it into an application is a complex, time-consuming process that requires highly specialized knowledge. This is exactly the process Amazon Bedrock simplifies, providing access to first-class foundation models through simple APIs.

With Amazon Bedrock Agents, a fully managed capability, developers can easily create generative AI applications that complete complex tasks across a wide range of use cases. The vector database helps developers' applications store data in real time, recall information promptly, and deliver a better user experience, while Amazon EC2 P5 instances save enormous amounts of training time and compute. Together, these innovations show that Amazon Cloud Technology is an end-to-end leader in generative AI, helping enterprise developers unlock its potential and create value.

At the same time, Amazon Cloud Technology keeps lowering the barrier to entry for generative AI, leading the push to make GenAI accessible to everyone.

Not long ago, Amazon Cloud Technology announced the general availability of its coding assistant, Amazon CodeWhisperer, which uses an underlying foundation model to help developers work more efficiently. It generates code suggestions in real time based on developers' natural-language comments and the existing code in the IDE (integrated development environment).

Now, Amazon CodeWhisperer integrates with Amazon Glue Studio Notebooks for the first time, streamlining the user experience and improving development efficiency. In Amazon Glue Studio Notebooks, developers describe a task in natural language, and Amazon CodeWhisperer recommends one or more code snippets that accomplish it.

Amazon CodeWhisperer is optimized for the most commonly used APIs, such as Amazon Lambda and Amazon Simple Storage Service (Amazon S3), making it an excellent coding companion for developers building applications. Amazon Cloud Technology also offers seven free skills-training courses to help developers use generative AI, including a new course launched with Andrew Ng, "Generative AI with Large Language Models".
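The comment-driven workflow looks roughly like this. The comment is what a developer would type; the function below is hand-written here to illustrate the kind of S3-related suggestion the assistant produces (the helper name is hypothetical):

```python
from urllib.parse import urlparse

# CodeWhisperer-style workflow: the developer writes the comment, and the
# assistant suggests an implementation. The suggestion below is hand-written
# as an illustration; the helper name is hypothetical.

# function that splits an s3 uri like "s3://bucket/key/name.csv"
# into a (bucket, key) tuple
def parse_s3_uri(uri: str):
    parsed = urlparse(uri)
    if parsed.scheme != "s3":
        raise ValueError(f"not an S3 URI: {uri}")
    return parsed.netloc, parsed.path.lstrip("/")

print(parse_s3_uri("s3://my-bucket/data/2023/sales.csv"))
```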

Already applied in healthcare

This year's wave of large AI models has also inspired exploration of generative AI in the medical industry. Amazon Cloud Technology has acted here too, releasing a new service for healthcare software providers: Amazon HealthScribe.

Amazon HealthScribe uses generative AI and speech recognition to automate the drafting of clinical documentation, helping clinicians transcribe and analyze their conversations with patients. Its natural-language-processing capabilities can extract complex medical terms, such as medications and conditions, from those conversations, along with medical history, key takeaways, and the reason for the visit. The AI in Amazon HealthScribe is powered by Amazon Bedrock, which lets users build on pre-trained generative models from startups and from Amazon itself.

It is fair to say that, as a pioneer of global cloud computing, Amazon Cloud Technology has recognized the potential and importance of generative AI in the current wave: generative AI can transform every application, every business, and even every industry. Advances in data processing, compute, and machine learning are accelerating many enterprises' transition from experimentation to deployment.

Through services like Amazon Bedrock and partnerships with industry leaders, the company is democratizing access to generative AI. Building on continuous innovation, Amazon Cloud Technology is enabling developers, and the world, to reimagine experiences and bring the best products to life.

Origin blog.csdn.net/m0_66395609/article/details/132102724