Read "Generative AI" in one article

1. Introduction

This article is based on Google's "Introduction to Generative AI" and was organized with the help of ChatGPT to help you understand the concept of generative AI.

It covers four main parts:

  • Definition of generative AI
  • How Generative AI Works
  • Taxonomy of Generative AI Models
  • Applications of Generative AI

2. Introduction to generative AI

2.1 Definition of generative AI

Artificial intelligence is not the same as machine learning

Artificial intelligence is the broad field concerned with giving machines the ability to mimic human intelligence. It involves enabling computer systems to perform tasks that normally require human intelligence, such as speech recognition, image recognition, natural language processing, and decision making.

Artificial intelligence aims to equip machines with human-like reasoning, learning, problem-solving, and decision-making abilities.

Machine learning is a branch of artificial intelligence that uses data and statistical models to let machines learn and improve automatically. Its goal is to design and develop algorithms that enable computer systems to learn from data without being explicitly programmed. By training models, machine learning enables machines to recognize patterns, make predictions, and make decisions.

In short, artificial intelligence is the broader concept, encompassing the goals and techniques of giving machines human-like intelligence. Machine learning is one approach to artificial intelligence: machines learn from data and automatically adapt models to accomplish tasks. Machine learning is therefore a subset of artificial intelligence, but artificial intelligence is not limited to machine learning and also includes other methods and techniques.

Supervised and Unsupervised Learning in Machine Learning

Supervised learning and unsupervised learning are two different learning methods in machine learning.

Supervised learning trains a model on labeled training data. The training data consists of input features and their corresponding labels or outputs. By learning the relationship between input features and labels, the model can make predictions on new, unlabeled data. Common supervised learning algorithms include linear regression, logistic regression, decision trees, and support vector machines. Supervised learning is suited to classification, regression, and prediction tasks.
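
As a minimal sketch of the idea, the snippet below fits a straight line to a handful of labeled examples using the closed-form least-squares solution for one input feature. The data and the function names (`fit_line`, `predict`) are made up for illustration and are not taken from the original course material.

```python
# Supervised learning sketch: learn y = w * x + b from labeled examples
# using ordinary least squares (closed form, single input feature).

def fit_line(xs, ys):
    """Return the slope w and intercept b that minimize squared error."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    w = cov_xy / var_x
    b = mean_y - w * mean_x
    return w, b


def predict(w, b, x):
    """Apply the learned relationship to a new, unlabeled input."""
    return w * x + b


# Hypothetical labeled training data: each input x comes with a label y.
train_x = [1.0, 2.0, 3.0, 4.0]
train_y = [3.0, 5.0, 7.0, 9.0]

w, b = fit_line(train_x, train_y)
print(w, b)                  # learned parameters
print(predict(w, b, 10.0))   # prediction for an input not seen in training
```

The labels are what make this supervised: the model is corrected against known outputs, then reused on inputs that have no label.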

Unsupervised learning automatically discovers patterns and structures in unlabeled data. The training data contains no label information, so the model has to uncover hidden structures and patterns through techniques such as clustering, dimensionality reduction, or association rule mining. Unsupervised learning can help us understand the distribution of data, find outliers, and perform data visualization and feature extraction. Common unsupervised learning algorithms include clustering algorithms (such as K-means), principal component analysis (PCA), and association rule mining.
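
To make the clustering case concrete, here is a small sketch of K-means on one-dimensional data. The points and the starting centers are invented for illustration, and the function name `kmeans_1d` is an assumption, not something from the original material.

```python
# Unsupervised learning sketch: K-means clustering on 1-D points.
# No labels are given; the algorithm groups points by proximity.

def kmeans_1d(points, initial_centers, iterations=10):
    centers = list(initial_centers)
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its cluster.
        centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return centers, clusters


# Hypothetical unlabeled data with two visible groups.
points = [1.0, 1.5, 2.0, 9.0, 10.0, 11.0]
centers, clusters = kmeans_1d(points, initial_centers=[0.0, 5.0])
print(centers)
print(clusters)
```

Nothing in the input says which group a point belongs to; the structure is discovered from the data alone.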

In short, supervised learning uses labeled training data to train a model and makes predictions based on known relationships between inputs and outputs. Unsupervised learning works on unlabeled data and gains insight by discovering patterns and structures in the data. Both learning methods play an important role in solving different types of problems and application scenarios.

Deep learning

Deep learning is a branch of machine learning.

Machine learning is a method of enabling computer systems to learn from data through algorithms and models. Its goal is to enable machines to automatically discover patterns, make predictions, and make decisions from data without being explicitly programmed. Machine learning algorithms learn from the given input data and optimize performance by adjusting the model's parameters. Common machine learning algorithms include linear regression, decision trees, support vector machines, and random forests.

Deep learning is a specific area of machine learning that uses artificial neural network models for learning and training. A deep learning model consists of multiple layers (the layers of a neural network), each of which transforms and re-represents its input. Together, these layers map input data to output results through a series of nonlinear transformations. The core of deep learning is the deep neural network (DNN), which can be trained on large amounts of labeled data to achieve highly accurate prediction and classification.
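
The layered, nonlinear mapping described above can be sketched with a tiny two-layer network. The weights here are set by hand purely for illustration (in practice they would be learned from data), and the names `dense`, `relu`, and `forward` are assumptions. With these particular weights the network computes XOR, a function that no single linear layer can represent.

```python
# Deep learning sketch: a two-layer network as a chain of
# linear transformations separated by a nonlinearity (ReLU).

def relu(values):
    return [max(0.0, v) for v in values]


def dense(inputs, weights, biases):
    """One fully connected layer; `weights` has one row per output unit."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


# Hand-picked (not trained) parameters, chosen so the network computes XOR.
HIDDEN_W = [[1.0, 1.0], [1.0, 1.0]]
HIDDEN_B = [0.0, -1.0]
OUT_W = [[1.0, -2.0]]
OUT_B = [0.0]


def forward(x):
    hidden = relu(dense(x, HIDDEN_W, HIDDEN_B))   # layer 1 + nonlinearity
    return dense(hidden, OUT_W, OUT_B)[0]         # layer 2 (output)


for x in ([0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]):
    print(x, forward(x))
```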

In general, machine learning is the more general approach and can use a variety of algorithms and techniques, while deep learning is a specific branch of machine learning that uses deep neural networks to learn and predict. The main advantage of deep learning is that it can automatically learn higher-level feature representations from raw data, which makes more accurate and more complex models possible. However, deep learning usually requires more data and more computing resources for training, and it is more complicated than traditional machine learning algorithms.

The relationship between generative AI and deep learning

Generative AI is a branch of deep learning.

Discriminative and Generative Models

Discriminative models and generative models are two different types of models in machine learning. They differ mainly in how they model data and in where they are applied.

A discriminative model directly models conditional probabilities. Given input data, it is primarily concerned with predicting the probability distribution over output classes or labels. Discriminative models learn the relationship between input and output to establish a decision boundary, which they then use to classify new input data. Common discriminative models include logistic regression, support vector machines, and deep neural networks. Discriminative models are commonly used for tasks such as classification, regression, and labeling.

A generative model models a joint probability distribution. It learns not only the relationship between input and output, but also the process that generates the input data. By learning the distribution of the data and the relationships between features, generative models can generate new samples. Common generative models include the Gaussian Mixture Model (GMM) and the Generative Adversarial Network (GAN). Generative models are commonly used for tasks such as image generation, language modeling, and data augmentation.

The choice between discriminative and generative models depends on the specific problem and task requirements. Discriminative models pay more attention to the accuracy of classification and prediction and directly model the relationship between input and output. Generative models pay more attention to the data generation process: they model the distribution of the data and can generate new samples, but they may not be as accurate as discriminative models on classification and prediction tasks.

In general, discriminative models focus on the relationship between input and output and are used for tasks such as classification and prediction, while generative models focus on the data generation process and can produce new samples. Which one to choose should be decided by the needs of the specific problem and the goals of the task.
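
The difference between a conditional and a joint distribution can be illustrated with simple counting. In the sketch below, the toy data set and the function names are invented for illustration: `conditional` estimates P(label | feature), which is all a discriminative model needs, while `joint_distribution` estimates P(feature, label), from which new (feature, label) pairs can be sampled.

```python
# Discriminative vs. generative sketch using counts on toy data.
import random
from collections import Counter

# Hypothetical observations: (feature, label) pairs.
DATA = (
    [("pointed ears", "cat")] * 3
    + [("pointed ears", "dog")] * 1
    + [("floppy ears", "cat")] * 1
    + [("floppy ears", "dog")] * 3
)


def conditional(pairs, feature):
    """Discriminative view: estimate P(label | feature)."""
    labels = Counter(label for f, label in pairs if f == feature)
    total = sum(labels.values())
    return {label: count / total for label, count in labels.items()}


def joint_distribution(pairs):
    """Generative view: estimate P(feature, label)."""
    counts = Counter(pairs)
    total = len(pairs)
    return {pair: count / total for pair, count in counts.items()}


def sample(joint, rng):
    """Draw a new (feature, label) pair from the joint distribution."""
    pairs = list(joint)
    weights = [joint[p] for p in pairs]
    return rng.choices(pairs, weights=weights, k=1)[0]


joint = joint_distribution(DATA)
print(conditional(DATA, "pointed ears"))    # used to classify
print(sample(joint, random.Random(0)))      # used to generate
```

The conditional table can only answer "which label goes with this input?", whereas the joint table also describes how likely each input is, which is what makes sampling new data possible.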

Supervised, semi-supervised and unsupervised learning for generative AI

In traditional supervised and unsupervised learning, training data and labeled data are fed to a model, which can then perform prediction, classification, and clustering.

In supervised, semi-supervised, and unsupervised learning for generative AI, training data, labeled data, and unlabeled data are fed to a foundation model, which then generates new content such as text, code, and images.

The difference between generative AI and traditional programming and neural networks

Traditional programming requires hard-coding rules that describe the characteristics of a cat.

A neural network can instead learn from samples labeled as cat or not cat; you then give it a picture and it judges whether the picture shows a cat.

With generative models such as LaMDA, PaLM, and GPT, which have been fed large amounts of content, you can directly ask "What is a cat?" and the model answers with what it has learned.
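
The contrast between the first two approaches can be sketched in a few lines. Below, `is_cat_rule` is a hand-written rule, while `train_perceptron` learns an equivalent decision from labeled examples using a single artificial neuron. The two binary features (`has_whiskers`, `barks`), the examples, and all function names are hypothetical simplifications; a real image classifier works on pixels and uses a much larger network.

```python
# Traditional programming vs. learning from examples (toy sketch).

def is_cat_rule(has_whiskers, barks):
    """Traditional programming: the rule is written by hand."""
    return has_whiskers == 1 and barks == 0


def train_perceptron(examples, epochs=10):
    """Learning: a single neuron adjusts its weights from labeled samples."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for features, label in examples:
            score = sum(w * x for w, x in zip(weights, features)) + bias
            prediction = 1 if score > 0 else 0
            error = label - prediction          # 0 when the guess is right
            weights = [w + error * x for w, x in zip(weights, features)]
            bias += error
    return weights, bias


def is_cat_learned(weights, bias, features):
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return score > 0


# Hypothetical labeled samples: (has_whiskers, barks) -> 1 for cat, 0 otherwise.
EXAMPLES = [((1, 0), 1), ((1, 1), 0), ((0, 1), 0), ((0, 0), 0)]

weights, bias = train_perceptron(EXAMPLES)
for features, label in EXAMPLES:
    print(features, is_cat_rule(*features), is_cat_learned(weights, bias, features))
```

In the first function a person encodes the knowledge; in the second, the same behavior emerges from the labeled samples.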

Definition of generative AI

What is generative AI?

  • Generative AI is a branch of artificial intelligence that generates new content based on what it has learned from existing content.
  • The process of learning from existing content is called training, and the result of training is a statistical model.
  • When the user gives a prompt, generative AI uses the statistical model to predict the expected response and generates new text to answer the question.

Classification of Generative Models

[Generative language model] is a technology based on natural language processing that generates new text by learning the rules and patterns of language. It can generate coherent sentences or paragraphs based on the preceding context and its semantic understanding. Generative language models are trained on large-scale text data, such as news articles, novels, or web content. By learning the relationships between words, phrases, and sentences in the text, they can automatically generate new, logically and grammatically correct text, such as articles, dialogues, and poems.

[Generative image model] is a technology based on computer vision that generates new images by learning the features and structures of images. It learns feature representations and statistical regularities from its training data and then uses this knowledge to generate new images. Generative image models are usually trained on large-scale image datasets, such as natural images or works of art. By learning the texture, color, and shape of images and the relationships between objects in them, they can generate new images with visual realism or a particular artistic style, such as natural landscapes, portraits, or abstract artworks.

When the input to generative AI is an image, the output can be text (image captioning, visual question answering, image search), an image (super-resolution, image editing), or video (animation).

Super-resolution refers to the process of increasing the resolution of an original image through hardware or software, that is, obtaining a high-resolution image from one or more low-resolution images.
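
To show the shape of the task only, the sketch below performs nearest-neighbor upscaling, the simplest non-learned way to enlarge an image. It is not a generative super-resolution model: it merely repeats pixels, whereas learned models try to reconstruct plausible detail. The function name `upscale_nearest` and the tiny grayscale image are made up for illustration.

```python
# Baseline upscaling sketch: nearest-neighbor enlargement of a
# grayscale image stored as a list of rows of pixel values.

def upscale_nearest(image, factor):
    """Repeat every pixel `factor` times horizontally and vertically."""
    result = []
    for row in image:
        wide_row = [pixel for pixel in row for _ in range(factor)]
        for _ in range(factor):
            result.append(list(wide_row))
    return result


# Hypothetical 2x2 low-resolution image (0 = black, 255 = white).
low_res = [
    [0, 255],
    [255, 0],
]

high_res = upscale_nearest(low_res, 2)
for row in high_res:
    print(row)
```

The output has twice the width and height but no new information, which is the gap that super-resolution models aim to close.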

When the input to generative AI is text, the output can be text (translation, summarization, question answering, grammar correction), images (image generation, video generation), audio (text-to-speech), or decisions (playing games).

2.2 How Generative AI Works

Generative language models learn language patterns from their training data; given some text, they predict what comes next.
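
A drastically simplified version of "predict what comes next" is a bigram model, which counts which word follows which in the training text. The sketch below is only an illustration of the principle: real generative language models use neural networks with billions of parameters, and the corpus and function names here are invented.

```python
# Next-word prediction sketch: a bigram model built from word counts.
from collections import Counter, defaultdict


def train_bigram(text):
    """Training: count, for every word, which words follow it."""
    words = text.lower().split()
    following = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        following[current][nxt] += 1
    return following


def predict_next(model, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    candidates = model.get(word.lower())
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]


def generate(model, start, length):
    """Generation: repeatedly append the predicted next word."""
    words = [start]
    for _ in range(length):
        nxt = predict_next(model, words[-1])
        if nxt is None:
            break
        words.append(nxt)
    return " ".join(words)


# Hypothetical training text.
corpus = "the cat sat on the mat the cat sat on the sofa"
model = train_bigram(corpus)

print(predict_next(model, "cat"))
print(generate(model, "the", 3))
```

Training produces a statistical model (here, a table of counts), and generation is nothing more than repeated prediction from that model.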

The user's input is processed by the encoder and decoder of the Transformer model, then passed through the generative pre-trained model, and finally the result is returned to the user.

Pre-training:

  • Massive data
  • Billions of parameters
  • Unsupervised learning

The model learns from large amounts of text data and tries to predict the next word or phrase. Sometimes, however, it generates words or phrases that do not conform to grammatical rules or whose meaning is unclear; these are called "hallucinations".

Hallucinations can be regarded as errors or defects in the generation process. They may be caused by insufficient training data, poor-quality training data, too little context given to the model, or too few constraints placed on the model.

A prompt is a piece of text given as input to a large language model, and it can be used in various ways to control the model's output.

Prompt design is the process of creating prompts that produce the desired output from a large language model. As mentioned above, generative AI depends heavily on the training data it was fed: it analyzes the patterns and structures of the input data in order to generate content. The quality of the input therefore determines the quality of the output.
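
One common way to shape a prompt is to combine an instruction, a few worked examples, and the actual query. The helper below is a minimal sketch of that idea; the function name `build_prompt`, the `Input:`/`Output:` layout, and the example data are assumptions for illustration, and no model is actually called.

```python
# Prompt design sketch: assemble an instruction, examples, and a query
# into a single piece of text that would be sent to a language model.

def build_prompt(instruction, examples, query):
    lines = [instruction.strip(), ""]
    for source, target in examples:
        lines.append(f"Input: {source}")
        lines.append(f"Output: {target}")
        lines.append("")
    lines.append(f"Input: {query}")
    lines.append("Output:")   # left open for the model to complete
    return "\n".join(lines)


# Hypothetical task: English-to-French word translation.
prompt = build_prompt(
    instruction="Translate each English word into French.",
    examples=[("cat", "chat"), ("dog", "chien")],
    query="cheese",
)
print(prompt)
```

The examples constrain the format of the answer and the instruction constrains the task, which are two of the levers the text above refers to.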

2.3 Types of Generative Models

Text-to-text generative models take a text input and generate an associated text output. Such models can be used for tasks such as machine translation, text summarization, dialogue generation, and story generation. By learning the mapping from input to output, they can generate new text that is semantically and syntactically correct.

Common application scenarios:

  • Machine translation: Translate text from one language into another.
  • Text summarization: Generate concise summaries of long texts.
  • Dialogue generation: Generate natural, fluent dialogue for use in virtual assistants or chatbots.
  • Story generation: Automatically generate coherent, interesting stories or narratives.

Text-to-image generative models take a textual description as input and generate a corresponding image as output. Such models convert natural language descriptions into visual content for tasks such as image generation, image annotation, and image editing. By learning the semantic association between textual descriptions and images, they can generate images that match the description.

Common application scenarios:

  • Image generation: Generate images that match text descriptions.
  • Image annotation: Convert image descriptions into natural language annotations.
  • Image editing: Edit images through text commands, such as adding, modifying, or deleting specific content.

Text-to-video and text-to-3D generative models take a text input and generate a corresponding video or 3D model as output. These models can be used for video generation, scene synthesis, 3D model generation, and similar tasks. They learn the transformation from text descriptions to video sequences or 3D models, and generate videos or 3D models that match the description.

Common application scenarios:

  • Video generation: Generate a video that matches the text description.
  • Scene synthesis: Generate 3D scenes or virtual reality experiences from text descriptions.
  • 3D model generation: Generate 3D models with specific properties or shapes from text descriptions.

Text-to-task generative models perform specific tasks based on text input. These models receive natural language instructions or questions and produce the corresponding result. For example, a question-answering model receives a question and generates an answer, and a code generation model receives a natural language description and generates the corresponding code. In other words, such a model translates textual instructions into concrete operations.

Common application scenarios:

  • Question answering: Generate answers or solutions to questions.
  • Code generation: Convert natural language descriptions into code.
  • Instruction execution: Carry out specific tasks according to natural language instructions, such as image processing or data manipulation.

Model Garden: Google Vertex AI offers many foundation models for language and vision to choose from.

Model Garden is a model library within Google's Vertex AI platform that aims to provide researchers and developers with pre-trained machine learning models and related tuning and optimization tools. These models cover many different machine learning tasks, such as image classification, object detection, and natural language processing.

Models in Model Garden fall into two categories: language models and vision models.

  1. Language models: These models perform specific language processing tasks, such as:
    • Extraction: includes Syntax Analysis, which parses the grammatical structure of text.
    • Classification: includes Entity Analysis (identifying specific entities in the text, such as personal names and place names), Content Classification (classifying text according to its topic), Sentiment Analysis (evaluating the emotional orientation of the text, such as positive or negative), and Entity Sentiment Analysis (assessing the sentiment expressed toward a specific entity in the text).
  2. Vision models: These models perform specific vision tasks, such as:
    • Classification: includes the Object Detector (identifying specific objects in pictures).
    • Detection: includes Occupancy Analytics (analyzing the flow of people in a specific area), the Person/Vehicle Detector (identifying people or vehicles in pictures), the PPE Detector (identifying whether someone in the picture is wearing personal protective equipment), and Person Blur (blurring the people in the picture).

These models are trained and optimized for specific tasks and can be used to solve some specific practical problems.

2.4 Generative AI Applications

Generative AI has a large market and many applications: text (writing assistance, AI note-taking, sales copywriting, chatbots, email writing, etc.), code (code generation, code documentation, text-to-SQL, web application building, etc.), images, speech, video, 3D, and more.

Figure: Bard code generation demo

Figure: Bard code generation capabilities

Figure: Introduction to GenAI Studio

The Generative AI App Builder helps you build AI apps without any coding.

PaLM API and MakerSuite can make generative AI development easier.


Writing is not easy. If this article was helpful to you, please like, bookmark, and follow. Your support and encouragement are my biggest motivation to keep writing.

You are welcome to join my Knowledge Planet (Knowledge Planet ID: 15165241) to communicate and learn together.
When applying at https://t.zsxq.com/Z3bAiea, please mention that you came from CSDN.

Origin blog.csdn.net/w605283073/article/details/130675917