A JavaScript technology stack for generative AI

New software infrastructure technologies are hard to understand without actually using them. At least that's what the a16z infrastructure team discovered, and since many of us started our careers as programmers, we often learn by doing. This is especially true with the wave of generative AI, which has arrived so fast and so dramatically that good documentation often lags months behind the code. So, to better understand the field, we've been building projects around large language models (LLMs), large image models, vector databases, and more.

In doing so, we noticed that since all of this is so new and changing so quickly, there really isn't a good framework for getting started quickly. Every project requires a bunch of boilerplate code and integrations. Frankly, it's a pain. So we set out to create a very simple selection of "Getting Started with AI" templates for those who want to try out the core technology but don't want to worry too much about ancillary issues like authentication, hosting, and tooling.

You can fork and deploy the templates here. We'd love to hear your ideas and feedback to make the templates even better.



1. Components

Many of us are JavaScript/TypeScript lovers, so we chose the JavaScript stack as our starting point. Nonetheless, this framework can be easily modified to support other languages, and we plan to do so soon.

Here's a brief overview of the starter stack we built with longtime collaborator and open source enthusiast Tim Qi. The goal is to highlight the shortest path from pulling the code from GitHub to running a generative AI application (image and text generation). It is designed to be easily extensible to more complex architectures and projects:

  • Authentication: Clerk
  • Application logic: Next.js
  • Vector database: Pinecone/Supabase pgvector
  • LLM orchestration: Langchain.js
  • Image model: Replicate
  • Text model: OpenAI
  • Deployment: Fly.io

For a more detailed overview of the emerging LLM stack, check out our previous article "Emerging Architectures for LLM Applications".
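To sketch how the application-logic layer ties these pieces together, here is a minimal retrieval-augmented flow in TypeScript. The `buildPrompt` and `answerQuestion` helpers are illustrative, not part of the actual template; in the real stack this role is played by Langchain.js talking to OpenAI:

```typescript
// Minimal sketch of the application-logic layer: merge retrieved context
// into a prompt, then hand it to an LLM client. Names are illustrative.

// Pure helper: merge retrieved context chunks into a single prompt string.
export function buildPrompt(question: string, contextChunks: string[]): string {
  const context = contextChunks.map((c, i) => `[${i + 1}] ${c}`).join("\n");
  return `Answer the question using only the context below.\n\nContext:\n${context}\n\nQuestion: ${question}`;
}

// Stand-in for an LLM client (e.g. a Langchain.js chain over OpenAI).
type ChatModel = (prompt: string) => Promise<string>;

export async function answerQuestion(
  question: string,
  contextChunks: string[],
  model: ChatModel
): Promise<string> {
  return model(buildPrompt(question, contextChunks));
}
```

Keeping the prompt assembly separate from the model client makes it easy to swap OpenAI for another provider later.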

2. Models and inference

Model hosting is a pain, and largely an orthogonal problem to building applications. Therefore, we use OpenAI for text inference and Replicate for image inference. Replicate also offers text-based models (see how easy it is to run Vicuna), so you can use it instead of OpenAI if you want.
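As a sketch of what the Replicate side looks like, its REST API creates a prediction with a model version and an input object. The version hash below is a placeholder (look it up on the model's page at replicate.com), and `createPrediction` assumes a `REPLICATE_API_TOKEN` environment variable:

```typescript
// Sketch of calling Replicate's prediction API for an image model.
// The version id is a placeholder; check the model page for the real one.

interface ReplicatePrediction {
  version: string;
  input: { prompt: string };
}

// Pure helper: build the JSON body for POST /v1/predictions.
export function buildPredictionBody(version: string, prompt: string): ReplicatePrediction {
  return { version, input: { prompt } };
}

// Fire the request (requires REPLICATE_API_TOKEN in the environment).
export async function createPrediction(version: string, prompt: string) {
  const res = await fetch("https://api.replicate.com/v1/predictions", {
    method: "POST",
    headers: {
      Authorization: `Token ${process.env.REPLICATE_API_TOKEN}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify(buildPredictionBody(version, prompt)),
  });
  return res.json(); // returns a prediction id you can poll for the output
}
```

Predictions are asynchronous: the response contains an id and status, and you poll (or use a webhook) until the output URL is ready.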


3. Authentication

For starter frameworks, we usually don't bother to include auth. In this case, however, the models are so powerful and general that they have been the target of massive, organized efforts to gain free access. Developers often learn this the hard way when their model provider unexpectedly sends them a $10,000 bill. That's why we chose to include Clerk, which does the heavy lifting of bot detection and, of course, provides full authentication support if you end up building more complex applications.
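In the template, Clerk's middleware performs this check for you. As a generic illustration of the gating logic, here is a sketch of rejecting unauthenticated requests before they ever reach a billable model call; the `SessionVerifier` is a hypothetical stand-in for Clerk's token validation:

```typescript
// Generic sketch of gating a model endpoint behind auth. In the real
// stack, Clerk's middleware performs this check; verify() is a stand-in.

type Session = { userId: string } | null;

// Hypothetical verifier: in the real stack Clerk validates the token.
export type SessionVerifier = (token: string | undefined) => Session;

// Returns 401 before any (billable) model call is made.
export function guardRequest(
  authHeader: string | undefined,
  verify: SessionVerifier
): { status: number; userId?: string } {
  const token = authHeader?.replace(/^Bearer /, "");
  const session = verify(token);
  if (!session) return { status: 401 };
  return { status: 200, userId: session.userId };
}
```

The key point is ordering: the auth check runs first, so anonymous bots never trigger a paid inference request.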

4. Vector database

LLMs require reliable long-term memory to hold state and work around context windows; this is handled by vector databases. Currently, Pinecone is the most mature and popular vector database for the generative AI crowd. That said, we want to support all use cases and preferences, so we've also included support for Supabase's pgvector in the repository.
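Under the hood, a vector store ranks stored embeddings by similarity to a query embedding. Cosine similarity, a common metric in both Pinecone and pgvector, can be sketched in a few lines (the in-memory `topK` below is purely illustrative of what these databases do at scale with approximate indexes):

```typescript
// Cosine similarity: a ranking metric a vector database applies between
// a query embedding and each stored embedding.
export function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Nearest-neighbor lookup over an in-memory list -- what Pinecone or
// pgvector do at scale with indexed, approximate search.
export function topK(query: number[], docs: { id: string; vec: number[] }[], k: number) {
  return [...docs]
    .sort((x, y) => cosineSimilarity(query, y.vec) - cosineSimilarity(query, x.vec))
    .slice(0, k)
    .map((d) => d.id);
}
```

This is also why the choice of database is swappable in the template: the interface is the same either way — upsert embeddings, then query by similarity.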

5. Deployment

For deployment, we use Fly.io because it is multi-regional, easy to manage, and provides a very general computing environment (anything that runs in a container).

Over time, many AI projects end up spanning multiple languages and may require significant backend functionality, so Fly.io is a good compromise between a JavaScript-native hosting environment like Vercel or Netlify and a traditional cloud. That said, the code can easily support other hosting environments if you so choose. Fly.io will also soon offer GPUs for those cases where you want to host your own models.
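Since Fly.io runs anything in a container, the deployment config stays small. A minimal sketch of what a `fly.toml` for a Next.js app might look like (the app name, region, and port are placeholders — the template's actual config may differ):

```toml
# Minimal fly.toml sketch; values are placeholders.
app = "my-ai-starter"
primary_region = "iad"

[http_service]
  internal_port = 3000   # Next.js default port
  force_https = true
```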

6. Roadmap

While we think the first iteration is a good starting point, we're fleshing out the tech stack with more options. Here is our roadmap:

  • Interactive CLI for create-ai-stack where developers can choose their own project scaffolding and dependencies
  • Transactional database for advanced use cases (e.g. persisting questions in Q&A apps, user preferences, etc.)
  • More options for vector databases and deployment platforms
  • A lightweight fine-tuning step for open-source models

Link to the original text: JS Technology Stack for Generative AI — BimAnt
