New Infrastructure for the Generative AI Era

Generative artificial intelligence has taken the tech industry by storm. In Q1 2023, as hundreds of millions of users adopted apps like ChatGPT and GitHub Copilot, investment in next-generation AI startups topped $1.7B. Leading tech companies are scrambling to develop their own generative AI strategies, and many are struggling to get applications into production. Even the most cutting-edge engineering teams find it challenging to train, deploy, and secure generative AI models in a safe, reliable, and cost-effective manner.

New infrastructure technology stacks for generative AI are emerging. We see great opportunity for new startups in this space, especially those that address the high costs associated with deploying models to production, data management, and model evaluation.


1. New infrastructure for generative AI

(Figure: overview of the new infrastructure stack for generative AI)

2. Foundation models

Foundation models are trained on massive datasets and can perform a wide range of tasks. Developers use them as the basis for powerful generative AI applications such as ChatGPT.

A key consideration when choosing a foundation model is open source versus closed source; the pros and cons of each are below.

Open source:

  • Pros: Open source models are easier to customize, provide greater transparency into training data, and give users greater control over cost, output, privacy, and security.
  • Cons: Open-source models may require more work to prepare for deployment, along with more fine-tuning and training. While they can be more expensive to set up, at scale companies retain more control over costs than with closed-source models, where usage is less predictable and spend can get out of hand.

Closed source:

  • Pros: Closed-source models (such as GPT-4) often come with hosting infrastructure and compute environments. They may also offer ecosystem extensions that expand model capabilities, such as OpenAI's ChatGPT plugins. Closed-source models can also deliver more functionality and value "out of the box" because they are pre-trained and typically accessible through APIs.
  • Cons: Closed-source models are black boxes; users know very little about their training data, which makes outputs difficult to interpret and tune. Vendor lock-in can also lead to unmanageable costs: GPT-4, for example, is billed per prompt and completion token (a rough cost sketch follows below).
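
To make the pricing concern concrete, here is a minimal sketch of estimating spend on a usage-metered API; the per-token rates below are hypothetical placeholders, not published GPT-4 prices.

```python
# Minimal sketch of estimating spend on a usage-metered LLM API.
# The per-token rates are hypothetical placeholders, not real GPT-4 prices.
PROMPT_RATE = 0.03      # assumed USD per 1K prompt tokens
COMPLETION_RATE = 0.06  # assumed USD per 1K completion tokens

def request_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Cost of one call, billed separately for prompt and completion."""
    return (prompt_tokens / 1000 * PROMPT_RATE
            + completion_tokens / 1000 * COMPLETION_RATE)

# 1M monthly requests averaging 500 prompt + 500 completion tokens:
print(f"${1_000_000 * request_cost(500, 500):,.0f}")  # -> $45,000
```

Even at these assumed rates, a moderate workload reaches tens of thousands of dollars per month, which is why usage-based billing is hard to predict at scale.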

We think open source will be the more attractive option for enterprise teams building generative AI applications. As two Google researchers have pointed out, open-source models have advantages in community-driven innovation, cost management, and trust.

3. Fine-tuning and training

Fine-tuning is the process of adjusting the parameters of an existing model by training it on a curated dataset to build "expertise" for a specific use case.

Fine-tuning can improve performance and reduce training time and cost by allowing developers to leverage pre-trained large models.

There are a variety of options for fine-tuning pretrained models, including open-source frameworks like TensorFlow and PyTorch, as well as end-to-end solutions like MosaicML. We want to emphasize the importance of labeling tools for fine-tuning: a clean, well-curated dataset can speed up training and improve accuracy.
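
To make this concrete, here is a minimal fine-tuning sketch using Hugging Face Transformers on a PyTorch backend; the model name, dataset, and hyperparameters are illustrative assumptions, not a recommended recipe.

```python
# Minimal fine-tuning sketch with Hugging Face Transformers (PyTorch backend).
# The model name, dataset, and hyperparameters are illustrative assumptions.
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=2)

# A clean, well-labeled dataset matters as much as the model choice.
dataset = load_dataset("imdb").map(
    lambda batch: tokenizer(batch["text"], truncation=True,
                            padding="max_length", max_length=256),
    batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="finetune-out", num_train_epochs=1,
                           per_device_train_batch_size=16),
    train_dataset=dataset["train"].shuffle(seed=42).select(range(2000)),
)
trainer.train()  # updates the pretrained weights on the curated data
```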

Recently, we've seen a lot of activity around domain-specific generative AI models. Bloomberg has launched BloombergGPT, a proprietary LLM trained on financial-industry data. Hippocratic, a startup building its own LLM trained on healthcare data to power consumer-facing applications, raised $50 million from Andreessen Horowitz and General Catalyst. Synteny AI is building a model trained on binding affinities between proteins to drive better drug discovery. We believe incumbents are well positioned to fine-tune powerful models on their own proprietary data and build their own AI edge.

4. Data storage and retrieval

Storage for long-term memory and data retrieval is a complex and costly infrastructure challenge, presenting an opportunity for startups to build more efficient solutions. Vector databases have emerged as a powerful solution for model training and for downstream retrieval and recommendation systems. This makes vector databases one of the hottest categories in generative AI infrastructure.


Vector databases can support a variety of applications, including semantic search (searching by meaning rather than keywords), similarity search (finding similar data via shared features), and recommender systems. They also give models long-term memory, which helps reduce hallucinations (confident responses from an AI that are not justified by its training data).
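
To illustrate the core operation these systems are built around, here is a minimal sketch of embedding-based similarity search; the embed() function is a toy stand-in for a real embedding model, and a production vector database adds approximate-nearest-neighbor indexing, persistence, and scaling on top of this idea.

```python
# Minimal sketch of vector similarity search using numpy. A real vector
# database layers ANN indexing and persistence on top of this core idea;
# embed() is a toy stand-in for a real embedding model.
import numpy as np

def embed(text: str) -> np.ndarray:
    """Stand-in embedding: hash characters into a fixed-size unit vector."""
    vec = np.zeros(64)
    for i, ch in enumerate(text.lower()):
        vec[(i + ord(ch)) % 64] += 1.0
    return vec / (np.linalg.norm(vec) + 1e-9)

docs = ["vector databases store embeddings",
        "fine-tuning adapts a pretrained model",
        "semantic search matches meaning, not keywords"]
index = np.stack([embed(d) for d in docs])   # one row per document

def search(query: str, k: int = 2) -> list[str]:
    scores = index @ embed(query)            # cosine similarity (unit vectors)
    return [docs[i] for i in np.argsort(scores)[::-1][:k]]

print(search("how do I search by meaning?"))
```

Swapping the toy embed() for a real embedding model and the brute-force scan for an approximate-nearest-neighbor index is essentially what hosted vector databases productize.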

We see a lot of opportunity for innovation here. There is no guarantee that current approaches to semantic search and retrieval from databases will remain the most efficient (in speed and cost) or the most effective (in coverage). Cohere recently released its Rerank endpoint, a search-and-retrieval method that does not require migrating to a vector database. We've also seen teams use LLMs as inference engines attached to vector databases. We're excited to see the data storage and retrieval category evolve and more startups emerge.

5. Model Supervision: Monitoring, Observability, and Interpretability

These three supervision-related terms are often used interchangeably; however, they describe different steps in evaluating a model during and after production. Monitoring involves tracking performance, including identifying failures, outages, and downtime. Observability is the process of understanding whether performance is good or bad, or assessing the health of a system. Finally, interpretability is about explaining the output, for example, why the model made a certain decision.

Model supervision is a staple of more traditional MLOps stacks, and incumbents such as Arize have begun building products for teams deploying generative AI models. However, black-box closed-source models are difficult to supervise, and hallucinations are hard to account for without access to training data. Recent YC batches have produced several companies addressing these challenges, including Helicone and Vellum, highlighting how early the field is. Notably, both focus on tracking latency and usage, suggesting that cost remains the biggest pain point for teams building with generative AI.
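
As a minimal sketch of the monitoring layer such tools provide, the snippet below wraps a model call and records latency and token usage; call_model and its return shape are assumptions for illustration.

```python
# Minimal monitoring sketch: record latency and token usage for each model
# call. call_model() and its return shape are illustrative assumptions.
import time
from dataclasses import dataclass

@dataclass
class CallRecord:
    latency_s: float
    prompt_tokens: int
    completion_tokens: int

records: list[CallRecord] = []

def monitored_call(call_model, prompt: str) -> str:
    start = time.perf_counter()
    text, usage = call_model(prompt)  # assumed: (output text, token-count dict)
    records.append(CallRecord(
        latency_s=time.perf_counter() - start,
        prompt_tokens=usage["prompt_tokens"],
        completion_tokens=usage["completion_tokens"]))
    return text
```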

6. Model safety, security and compliance

As companies move generative AI models into production, model security and compliance will become increasingly important. In order for businesses to trust generative AI models, they need a set of tools to accurately assess the models for fairness, bias, and toxicity (generating unsafe or hateful content). We also believe that teams deploying models will need tools to help them implement their own guardrails.

Enterprise customers are also deeply concerned about threats such as sensitive data extraction, training-data poisoning, and leakage of training data (especially third-party sensitive data). Notably, Arthur AI recently released its new product, Arthur Shield, the first firewall for LLMs, which protects against threats such as prompt injection (manipulating output with malicious input), data exfiltration, and toxic language generation.
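
As a minimal sketch of an input guardrail, the snippet below screens prompts with naive pattern checks before they reach a model; the patterns are illustrative assumptions, and a production firewall like Arthur Shield relies on far more robust detection.

```python
# Minimal input-guardrail sketch: block prompts matching naive injection
# patterns before they reach the model. The patterns are illustrative only;
# a production firewall uses far more robust detection.
import re

INJECTION_PATTERNS = [
    r"ignore (all |any )?previous instructions",
    r"reveal (your )?system prompt",
]

def screen_prompt(prompt: str) -> str:
    for pattern in INJECTION_PATTERNS:
        if re.search(pattern, prompt, re.IGNORECASE):
            raise ValueError("possible prompt injection; request blocked")
    return prompt  # safe to forward to the model
```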


We see a huge opportunity in compliance middleware. Companies need to ensure that their generative AI applications do not violate compliance standards (copyright, SOC 2, GDPR, etc.). This is especially important for teams building in highly regulated industries such as finance and healthcare. We're excited to see innovation from startups as well as incumbents; for example, our portfolio company Drata is well positioned to integrate or build capabilities for generative AI model compliance.
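
As a rough sketch of what compliance middleware might do, the snippet below redacts obvious personal identifiers before text leaves the enterprise boundary; the regexes are illustrative assumptions, and real GDPR or SOC 2 tooling goes much further.

```python
# Minimal compliance-middleware sketch: redact obvious personal identifiers
# before text is sent to a third-party model. The regexes are illustrative;
# real GDPR/SOC 2 tooling is far more thorough.
import re

REDACTIONS = {
    "EMAIL": r"[\w.+-]+@[\w-]+\.[\w.]+",
    "SSN":   r"\b\d{3}-\d{2}-\d{4}\b",
}

def redact(text: str) -> str:
    for label, pattern in REDACTIONS.items():
        text = re.sub(pattern, f"[{label}]", text)
    return text

print(redact("Contact jane.doe@example.com, SSN 123-45-6789."))
# -> "Contact [EMAIL], SSN [SSN]."
```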

7. Conclusion

We believe generative AI will bring huge efficiency gains to companies and create major opportunities for new companies in the infrastructure space. The two biggest bottlenecks to adoption are cost and security. Infrastructure startups that address these core value pillars will be well positioned to succeed.

We also see open source playing an important role in generative AI infrastructure. Startups embracing this model will more easily gain users' trust and benefit from the innovation and support of the open-source community.


Original Link: A New Infrastructure Stack for Generative AI—BimAnt
