2. Approaching ChatGPT, toward AGI: the main steps of the machine learning / ChatGPT development process, and the steps ChatGPT takes after receiving a question

Goal: Towards AGI

We no longer want to write code by hand; we want a machine that can hear, see, touch, smell, and understand on its own (input), that can act, speak, draw, and express itself (output), and that can adapt to new situations to complete complex tasks. When human intervention is no longer needed, that is AGI (Artificial General Intelligence).

These inputs (multimodal) are data. The computer learns from the data (extracts features) and connects those features into a network (a neural network). Handling a task is like forming paths in the mind: the paths help us react quickly when we encounter similar situations again. They are formed through continuous learning and the adjustment of weights (fine-tuning).

Traditional programming approach: someone must explicitly write rules that say how to distinguish cats from dogs (and it is hard to cover every situation).
Machine learning (simulating the human brain): give the computer a large number of pictures of cats and dogs, each labeled as cat or dog (supervised learning). By analyzing this data, the computer "learns" on its own how to tell them apart. Alternatively, give the computer a large amount of unlabeled data and let it find the patterns or structure in the data by itself (unsupervised learning). Training models on data lets computers solve complex problems instead of relying on rules preset by humans.
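The contrast can be sketched in a few lines of Python. The feature (weight in kilograms) and the data below are invented for illustration, and `rule_based` and `learn_threshold` are hypothetical names; the "learning" here is just a brute-force search for the best threshold, standing in for real training:

```python
# Hypothetical sketch: rule-based vs. learned classification.
# The feature (weight_kg) and all data points are invented for illustration.

def rule_based(weight_kg):
    """Traditional programming: a human writes the rule explicitly."""
    # Hand-written rule -- brittle, and hard to cover every real-world case.
    return "dog" if weight_kg > 10 else "cat"

def learn_threshold(samples):
    """Supervised learning, caricatured: find the rule from labeled data.

    samples: list of (weight_kg, label) pairs.
    Picks the weight threshold that classifies the training data best.
    """
    best_t, best_correct = 0.0, -1
    for t in sorted(w for w, _ in samples):
        correct = sum(
            (("dog" if w > t else "cat") == label) for w, label in samples
        )
        if correct > best_correct:
            best_t, best_correct = t, correct
    return best_t

data = [(4, "cat"), (5, "cat"), (3, "cat"), (20, "dog"), (25, "dog"), (18, "dog")]
threshold = learn_threshold(data)
print("learned threshold:", threshold)  # the boundary comes from data, not a hand-coded rule
```

The point of the toy: in the first function a person chose the number 10; in the second, the data chose it.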

Let's get closer to ChatGPT and look at the underlying technology step by step.

Neural networks (which can learn, by learning features and fine-tuning parameter weights) can handle tasks such as recognizing images, understanding language, or playing games.

The computer extracts features from the data and adapts its internal connections to fit those patterns. This makes neural networks well suited to the complex, varied data we encounter in daily life.

Why can neural networks learn? This is important.

Through training, the connections (weights) are adjusted to fit the input data. It is like learning to ride a bicycle: at first we may fall, but by constantly trying and adjusting our balance (the weights), we eventually learn to ride.

What a neural network mainly acquires through training is learned features and fine-tuned parameter weights.

  1. Feature learning: in neural networks, feature learning happens automatically. A neural network can extract useful features from raw data through its multi-layer structure. For example, in an image recognition task, the lower layers of a network might learn basic features like edges or colors, while higher layers might learn more complex features like shapes or parts of specific objects. These features are usually not designed by humans; the network learns them automatically from the training data.

  2. Parameter weight fine-tuning : Each neuron in the neural network has a corresponding weight and bias, which are continuously adjusted during the training process. The weight determines the importance of the input signal, while the bias provides an additional adjustment space to help the neuron better fit the training data. Through optimization techniques such as backpropagation and gradient descent, the network gradually adjusts these parameters to minimize the difference between predictions and actual results. This process is the fine-tuning of weights.

A simple analogy: the learning process of a neural network can be compared to cooking. The raw data is like the ingredients, and the neural network is the chef. Through continuous trial and adjustment (learning and weight adjustment), it finds the best recipe (model parameters), making the final dish (prediction result) as delicious (accurate) as possible. Feature learning is like identifying which ingredients (data features) go best together, while weight fine-tuning is adjusting the ratio of ingredients and cooking methods to achieve the best taste.
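The weight fine-tuning described above can be shown with a deliberately tiny example: one "neuron" with a single weight and bias, trained by gradient descent (the backpropagation step collapses to one line for a model this small) to fit the line y = 2x + 1. This is a toy sketch, not how ChatGPT is actually trained:

```python
# Minimal sketch of parameter weight fine-tuning: one neuron, one weight
# and one bias, trained by gradient descent to fit y = 2x + 1.

def train(samples, lr=0.01, epochs=1000):
    w, b = 0.0, 0.0                # start with untrained parameters
    for _ in range(epochs):
        for x, y in samples:
            pred = w * x + b       # forward pass
            err = pred - y         # difference between prediction and truth
            # Gradient of the squared error with respect to each parameter:
            w -= lr * err * x      # adjust the weight
            b -= lr * err          # adjust the bias
    return w, b

samples = [(x, 2 * x + 1) for x in range(-5, 6)]
w, b = train(samples)
print(f"learned w={w:.3f}, b={b:.3f}")  # approaches w=2, b=1
```

After training, the parameters have been "fine-tuned" to minimize the difference between predictions and actual results, exactly the process item 2 describes.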

The main steps of the ChatGPT development process:

  1. Requirements analysis and planning : Determine the goals and functionality of the model to be developed. This includes an in-depth understanding and analysis of target users, application scenarios, and desired functionality.

  2. Data collection : Collect large amounts of text data, which will be used to train language models. Data can come from a wide range of sources, including books, websites, forum posts, news articles, etc.

  3. Data preprocessing : Cleaning and processing the collected data. This step is very important as it involves removing irrelevant or low-quality content, standardizing text formats, handling special characters, etc.

  4. Model design : Select or design a suitable neural network architecture. This may include deciding to use a specific type of network (such as a Transformer), as well as configuring the network's size, number of layers, parameters, etc.

  5. Pre-training: use the collected data to pre-train the model. This stage usually demands a large amount of computing resources; the model learns the basic rules and patterns of language from large volumes of text.

  6. Fine-tuning and optimization : Fine-tune the model to suit specific tasks or application scenarios. This may include further training the model on specific types of data, or adjusting the model's parameters to optimize performance.

  7. Testing and evaluation : Test the model to evaluate its performance. This includes examining how the model responds to different types of inputs, as well as evaluating the model's accuracy, consistency, and response time on specific tasks.

  8. Integration and deployment : Integrate the trained model into an application or service and deploy it. This may involve integrating with existing systems, providing API interfaces, etc.

  9. Monitoring and maintenance : Ongoing monitoring and maintenance after model deployment. This includes tracking the model's performance and making necessary updates and optimizations based on user feedback and usage.

  10. Continuous iteration : Continuously iterate and update the model based on new data, technological advancements, and changes in user needs.
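As a rough illustration of step 3 (data preprocessing), here is a minimal text-cleaning sketch. The specific rules below (stripping HTML tags, lowercasing, collapsing whitespace, whitespace tokenization) are common examples, not ChatGPT's actual pipeline, which among other things uses subword tokenizers such as BPE:

```python
# Hedged sketch of text preprocessing: clean raw collected text before
# training. These rules are illustrative, not a production pipeline.
import re

def preprocess(raw_text):
    text = re.sub(r"<[^>]+>", " ", raw_text)    # drop HTML remnants
    text = text.lower()                          # normalize case
    text = re.sub(r"\s+", " ", text).strip()     # collapse whitespace
    return text

def tokenize(text):
    # Crude whitespace tokenization; real models use subword tokenizers.
    return text.split(" ")

doc = "<p>Hello,   World!</p>\nThis is  RAW data."
clean = preprocess(doc)
print(clean)            # "hello, world! this is raw data."
print(tokenize(clean))
```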

In short, developing a ChatGPT-type model is a complex process involving many steps, from requirements analysis to model deployment. Each step is critical, and together they ensure the performance and usability of the final model.

Processing steps after ChatGPT receives a question:

  1. Input parsing : First, ChatGPT receives and parses user input. This stage involves understanding the input text, including language, sentence structure, and meaning. If the input is a specific command or request (such as a request to find information, generate an image, etc.), ChatGPT will also identify these specific needs.

  2. Contextual understanding : ChatGPT takes into account contextual information relevant to the current conversation. This includes the content of the previous conversation, the user's profile (if provided), and any session-specific settings or instructions. This stage is to ensure that responses are coherent and fit within the historical context of the conversation.

  3. Information processing and decision-making : At this stage, ChatGPT determines the best response strategy based on the input and context. This may include extracting information from an internal knowledge base, performing certain tasks (such as running a Python script or generating an image), or combining multiple sources of information to form an answer.

  4. Generate response : After determining the response strategy, ChatGPT will generate a response. This process involves natural language generation (NLG), which uses machine learning models to construct sentences. At this stage, the model considers how to express the required information or perform the task in a clear, accurate, and natural way.

  5. Output formatting and delivery: the generated answer is formatted into a form suitable for reading and sent to the user. This step ensures that information is presented in a way that matches both user expectations and the current interaction platform (e.g. text chat, voice output, etc.).

In general, after receiving a question, ChatGPT will go through steps such as receiving and parsing, context understanding, information processing and decision-making, response generation, and output formatting to ensure that appropriate, relevant, and coherent answers are provided.
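The five stages above can be caricatured as a toy pipeline. Every function below is an invented placeholder with made-up logic; ChatGPT's real internals are vastly more complex and not public at this level of detail:

```python
# Toy sketch of the five stages: parse -> context -> decide -> generate -> format.
# All stage logic here is a placeholder for illustration only.

def parse_input(text):
    return {"text": text.strip(), "is_question": text.strip().endswith("?")}

def apply_context(parsed, history):
    parsed["history"] = history[-3:]  # keep a short window of prior turns
    return parsed

def decide_and_generate(state):
    if state["is_question"]:
        return "Here is an answer to: " + state["text"]
    return "Acknowledged: " + state["text"]

def format_output(answer):
    # Ensure the reply ends with terminal punctuation before delivery.
    return answer if answer[-1] in ".?!" else answer + "."

def respond(text, history=()):
    state = apply_context(parse_input(text), list(history))
    return format_output(decide_and_generate(state))

print(respond("What is AGI?"))
```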

From here on we look at this purely from a technical point of view, ignoring everything else; the discussion continues in the next article.

BLAS (Basic Linear Algebra Subprograms) is a set of standard low-level routines for common linear algebra operations, such as vector addition, matrix-vector multiplication, and matrix-matrix multiplication. BLAS exists mainly to make these operations efficient, especially in large-scale computations.

BLAS is very important in the fields of artificial intelligence and machine learning, because these fields often require processing a large number of linear algebra operations. For example, when training a neural network, a large number of matrix operations are involved, and BLAS can help speed up these operations.
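One concrete way to see this: NumPy delegates matrix multiplication to whatever BLAS library it was built against (e.g. OpenBLAS or Intel MKL, depending on the installation), which is why the same mathematical operation runs far faster through NumPy than through a hand-written Python loop:

```python
# Comparing a "from scratch" matrix multiply with NumPy's BLAS-backed one.
import numpy as np

def matmul_python(A, B):
    """Naive triple loop -- what we'd write by hand."""
    n, k, m = len(A), len(B), len(B[0])
    C = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            for t in range(k):
                C[i][j] += A[i][t] * B[t][j]
    return C

A = [[1.0, 2.0], [3.0, 4.0]]
B = [[5.0, 6.0], [7.0, 8.0]]

slow = matmul_python(A, B)
fast = np.array(A) @ np.array(B)  # dispatched to the BLAS *gemm routine
print(slow)   # [[19.0, 22.0], [43.0, 50.0]]
print(fast)
```

Both produce the same numbers; on large matrices the BLAS path is typically orders of magnitude faster thanks to cache-aware blocking and vectorized kernels.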

The role of BLAS can be understood this way: suppose you have a lot of Lego bricks that need to be assembled into different structures in a specific way. It would be very time consuming if you were starting from scratch each time to find the right building blocks and assembly methods. BLAS is like a set of predefined, efficient assembly guides that can quickly find the required building blocks (elements of linear algebra operations) and assemble them (perform linear algebra operations) in the most efficient way. This greatly improves the efficiency of building complex structures (completing complex computing tasks).

In summary, BLAS is an important tool for optimizing and performing linear algebra operations, and it plays a key role in artificial intelligence and machine learning.

Machine learning is when computer programs learn from data, and a neural network is one type of machine learning algorithm. A simple analogy: if machine learning is a restaurant, then neural networks are a special dish on its menu.

  1. Machine Learning (Restaurant) : Machine learning is an important branch of the field of artificial intelligence that involves the use of algorithms and statistical models to enable computer systems to learn and make decisions based on data. Just like a restaurant offers a variety of different dishes, machine learning offers a variety of different algorithms and techniques to solve different types of problems.

  2. Neural network (specialty dish): a neural network is a machine learning algorithm inspired by the network of neurons in the human brain. Neural networks are well suited to processing large amounts of data and excel in areas such as image recognition, speech recognition, and natural language processing. They are like a specialty at a restaurant: not the only option, but popular for their unique flavor and their ability to handle complex problems.

In short, neural networks are an important method in the field of machine learning, but they are not all of machine learning. There are many other algorithms and techniques, such as decision trees, support vector machines, etc., that also fall into the category of machine learning.

In addition to machine learning, the field of artificial intelligence (AI) also includes many other important branches.

These branches can be compared to different types of trees in a technological forest, each with its own unique characteristics and uses. The main branches include:

  1. Knowledge representation and reasoning : This is one of the traditional core areas of AI and involves understanding and representing knowledge of the external world and how to use this knowledge to perform effective reasoning. Imagine a tree, with its trunk representing a knowledge base and its branches representing reasoning mechanisms capable of deriving new conclusions from known information.

  2. Natural Language Processing (NLP) : NLP focuses on how to let computers understand, interpret and generate human language. It is like a tree that can understand and imitate the way humans communicate, with leaves that capture and reflect the complexity and subtlety of human language.

  3. Computer Vision : This field is dedicated to allowing machines to "see" the visual world and recognize and process image and video data. It's like a tree with visual perception, able to recognize and parse everything in its field of vision.

  4. Robotics : Robotics integrates multiple AI fields such as perception, decision-making, and action execution to create machines that can work autonomously or semi-autonomously. This is similar to a tree that is able to move and interact with its environment.

  5. Expert system : Expert systems imitate the decision-making capabilities of human experts and provide solutions to problems in specific fields. It's like a tree whose trunk and branches are tightly woven to form a network of rich expertise.

  6. Perception system : AI applications involving perception modes such as sound and touch can be compared to a tree that is very sensitive to environmental changes and can capture information from multiple sensory inputs.

  7. Evolutionary computation : Use the principles of natural selection (such as genetic algorithms) to solve optimization problems. It's like a tree constantly adapting to its environment and evolving.

Each branch has its unique research fields and application scenarios, which together constitute a rich and colorful field of artificial intelligence.

Origin blog.csdn.net/chenhao0568/article/details/135291472