From GPU to ChatGPT: sorting out the tangled connections between GPU, CPU, AI, NLP, and GPT

Table of contents

Hardware

GPU

What are GPUs?

How do GPUs work?

The Difference Between GPUs and CPUs

GPU manufacturers

Overseas leading GPU manufacturers:

Domestic GPU manufacturers:

NVIDIA's product matrix

AI

What is Artificial Intelligence (AI)?

Artificial Intelligence Segmentation

NLP

What are Transformers?

Implementation of the Transformer model

Are there any other models?

GPT model

ChatGPT

Compared with other existing similar products, the unique advantages of ChatGPT are:

GPT-3.5

The advantages of ChatGPT are:

GPT-4

AIGC model

Artificial intelligence breaks through Moore's Law

Future

References



Hardware

"Without hardware support, you crack a fart"


GPU


What are GPUs?

GPU stands for Graphics Processing Unit. GPUs were originally designed to speed up graphics processing on computers and are mainly responsible for computing and rendering images. Through parallel computing, a GPU can perform many operations at the same time, greatly improving the speed and efficiency of graphics and data processing.

In recent years, thanks to its parallel-computing characteristics, the GPU has also been adopted in fields that require massive computation, such as machine learning, deep learning, data mining, and scientific computing. In these areas, GPUs accelerate compute-intensive tasks such as model training and processing massive amounts of data, significantly improving computational efficiency and speed. The GPU has therefore become an important part of modern computers and is widely used across many fields.

How do GPUs work?

A GPU works on a similar principle to a CPU: both complete computing tasks by executing instructions. The difference is that a CPU mostly executes instructions serially, while a GPU executes instructions in parallel across many cores, running many operations at the same time and greatly improving computing efficiency and speed.

You can refer to this video to understand how the GPU works: https://www.bilibili.com/video/BV1VW411i7ah/?spm_id_from=333.337.search-card.all.click&vd_source=6fb7f58b736bb5913c33073b42979450
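As a rough illustration of this parallelism, here is a minimal sketch in Python, assuming PyTorch is installed and, for the GPU branch, an NVIDIA GPU with CUDA is available. The same element-wise computation is run on the CPU and then dispatched to the GPU's many parallel cores:

```python
import torch

# A large element-wise computation: the CPU works through it with a handful of cores,
# while the GPU spreads the same work across thousands of parallel threads.
x = torch.randn(4096, 4096)

y_cpu = x * 2 + 1                    # executed on the CPU

if torch.cuda.is_available():        # only if an NVIDIA GPU with CUDA is present
    x_gpu = x.to("cuda")
    y_gpu = x_gpu * 2 + 1            # the same computation, executed in parallel on the GPU
    print(torch.allclose(y_cpu, y_gpu.cpu()))   # results agree; only the execution differs
```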

The Difference Between GPUs and CPUs

The difference between GPU and CPU is mainly reflected in the following aspects:

  1. Different architectural design: CPUs are designed around single-threaded processing capability, usually with fewer computing cores and larger caches. GPUs are designed for parallel processing and usually have a large number of computing cores but small caches.

  2. Different computation model: a CPU processes tasks mainly by executing an instruction stream, while a GPU improves computing efficiency by running a large number of threads in parallel. This parallelism lets a GPU handle many similar operations at once, which suits large-scale compute-intensive tasks such as image processing and machine learning.

  3. Different uses: the CPU is mainly used for general computing tasks such as file processing, running the operating system, and programming. The GPU is mainly used for graphics processing, games, and compute-intensive tasks such as machine learning and deep learning.

To sum up, GPU and CPU each have their own strengths and applicable scenarios, and they usually work together. In machine learning, for example, the CPU typically handles data preprocessing and coordination, while the GPU accelerates model training and inference.

Is the "graphics card" we usually talk about the same thing as the GPU?

Yes. What we usually call a graphics card is a device built around a GPU. Besides the GPU itself, a graphics card includes components such as video memory, a heat sink, and the card's BIOS. The graphics card converts the data sent by the CPU into image signals and drives the display to output images.

In application scenarios that involve heavy image processing or computation, a GPU can complete the task far more efficiently than a CPU. Modern graphics cards are therefore widely used to accelerate computing in fields such as machine learning and deep learning, and even in scientific computing, astronomy, geology, and meteorology.

Regarding graphics cards, you may have heard the terms "integrated graphics" and "discrete graphics." These mainly describe how the GPU and its video memory are packaged and managed, and they differ as follows:

  1. Integrated graphics: the GPU and its video memory are integrated into the motherboard chipset or the processor. Such graphics usually offer modest performance and suit simple scenarios such as everyday office work and web browsing.

  2. Discrete graphics card: the video memory is independent of the motherboard chipset or processor, with its own video memory and memory controller. Such cards offer much stronger performance and suit scenarios that demand large amounts of video memory and computing power, such as games, graphics processing, and scientific computing.

  3. Shared video memory: the GPU shares memory with the system, i.e. a portion of system RAM is set aside as video memory. This approach suits light graphics workloads such as video playback and web browsing.

In general, integrated graphics offer modest performance for simple scenarios, discrete graphics cards offer much stronger performance for workloads that need large amounts of video memory and compute, and shared video memory is a compromise suited to light graphics workloads.

GPU manufacturers

Overseas leading GPU manufacturers:

  1. Nvidia: currently one of the largest GPU manufacturers in the world, Nvidia produces GPU products for gamers, data centers, professional users, and other segments.

  2. AMD: one of the world's leading GPU manufacturers, AMD produces GPU products for personal computers, workstations, servers, and other segments.

  3. Intel: Intel is also entering the GPU market, producing GPU products for personal computers, workstations, servers, and other segments.

Domestic GPU manufacturers:

Haiguang Information, Cambricon, Loongson Zhongke, Jingjia Micro, etc.

Is the chip "stuck" referring to the GPU?

Partly. GPUs are part of the story, but not all of it.

"Chip stuck neck" refers to the phenomenon of global semiconductor shortage, also known as "chip shortage" or "semiconductor shortage", which refers to the situation of insufficient global semiconductor supply caused by the new crown epidemic and other factors since 2020. This supply shortage has affected several industries, including automobiles, electronics, communications equipment and more. China, one of the largest semiconductor markets in the world, has also been affected by this supply shortage.

my country's independent research and development and manufacturing level in the semiconductor field is relatively low, relying on imported chips to support its economic and industrial development. Affected by the global chip shortage, some key industries in my country, especially the automotive, electronics and communications industries, have experienced supply shortages and price increases, which have had a certain impact on its economy. In response to this situation, the government has stepped up support for the semiconductor industry, encouraging local companies to increase chip R&D and production capacity to reduce dependence on imported chips.

Specifically related to GPU: On August 31, 2022, in order to comply with the requirements of the US government, sales of high-end GPUs from Nvidia and AMD will be suspended in China, including Nvidia’s A100, H100 and AMD’s MI100 and MI200 chips

Nvidia officially confirmed the matter in the SEC document, saying that it received a notification from the US government on August 26.

SEC documents are financial statements or other official documents submitted to the US Securities and Exchange Commission (SEC) by listed companies, insiders of listed companies, and securities firms.


NVIDIA


According to a market research report for the fourth quarter of 2021, Nvidia held a 51.2% share of the global discrete graphics card market, ranking first and ahead of its competitor AMD. In the overall GPU market (discrete plus integrated graphics), Nvidia's share was 18.8%, ranking second behind Intel.

NVIDIA's product matrix

  1. GeForce series: mainly for the consumer market, including desktop graphics cards and notebook computer graphics cards, etc., with high-performance games and multimedia applications as the main application scenarios.

  2. Quadro series: mainly for the professional workstation market, including film and television production, architectural design, scientific computing, medical imaging and other fields, with high performance, high stability and excellent graphics rendering capabilities.

  3. Tesla series: mainly for the high-performance computing market, including scientific computing, deep learning, artificial intelligence and other fields, with extremely high computing performance and data throughput, and supports multi-GPU cluster computing.

  4. Tegra series: mainly for mobile and embedded markets, including smartphones, tablet computers, automobiles, drones and other fields, featuring high performance, low power consumption, and small size.

  5. Jetson series: mainly for the artificial intelligence application market, including robotics, autonomous driving, intelligent video analysis and other fields, featuring high performance, low power consumption, and small size.


You may not have a clear picture of the product lines, models, and terms above, so let's start by getting a sense of price. Take the A100, a GPU widely used in artificial intelligence, as an example:

[Image: A100 price listing]

It is because of this price that the A100 is also known as the "Nvidia Big Gold Brick".

Why single out Nvidia? Because computing power is the "fuel" of artificial intelligence, and the GPU is its main supplier. Nvidia is the world's largest GPU manufacturer, and its GPUs deliver the highest computing power; for example, the A100 is rated at 10.5 petaFLOPS, while AMD's MI100 is rated at 7.5 petaFLOPS.

Not sure what that means? "Peta" is a unit prefix meaning 10 to the 15th power, so 1 petaFLOPS (PFLOPS) is 10^15 floating-point operations per second. A computing power of 10.5 petaFLOPS therefore means the A100 can perform 10.5 quadrillion (1.05 × 10^16) floating-point operations per second.
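To make the unit concrete, here is the same arithmetic in a few lines of Python, using the 10.5 petaFLOPS figure quoted above:

```python
PETA = 10 ** 15                        # the "peta" prefix means 10^15

a100_pflops = 10.5                     # the figure quoted above for the A100
ops_per_second = a100_pflops * PETA
print(f"{ops_per_second:.2e} floating-point operations per second")   # 1.05e+16
```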

AI

What is Artificial Intelligence (AI)?

Artificial intelligence refers to a computer technology that enables computer systems to simulate human intelligence behavior through learning, reasoning, self-adaptation and self-correction methods to achieve a series of tasks similar to human intelligence. These tasks include speech recognition, natural language processing, image recognition, machine translation, autonomous driving, intelligent recommendation and gaming, etc. At the heart of artificial intelligence is machine learning, which involves training computer systems using vast amounts of data and algorithms to recognize patterns, make predictions and make decisions. Artificial intelligence also involves other fields, such as natural language processing, computer vision, robotics, knowledge representation and reasoning, etc. Artificial intelligence is widely used in various fields, such as medical care, finance, transportation, manufacturing, media, and gaming, etc., bringing higher efficiency and innovation to these fields.

Artificial Intelligence Segmentation


There are many subfields in the field of artificial intelligence, some of the more common ones are listed below:

  1. Machine Learning: Study how to use algorithms and models to allow computers to learn and extract rules from data to complete specific tasks.
  2. Deep Learning: A type of machine learning that uses multi-layer neural networks to learn features and patterns to automate complex tasks.
  3. Natural Language Processing (NLP): Research methods and technologies on how to enable computers to understand, analyze, and process human language.
  4. Computer Vision: Research on how to make computers "understand" images and videos, and extract useful information and features from them.
  5. Robotics: The study of how to design, build, and control robots so that they can perform specific tasks.
  6. Reinforcement Learning: It is a machine learning method that learns the optimal action strategy through interaction and feedback with the environment.
  7. Knowledge Graph: It is a method of organizing, representing and reasoning knowledge in the form of a graph, which is used to realize applications such as intelligent search and recommendation.
  8. Speech Recognition: research on how to enable computers to recognize and understand human speech, enabling voice input, voice control, and other functions.

Of course, these branches also overlap and influence one another. For example, deep learning is applied in computer vision, natural language processing, and speech recognition; computer vision and natural language processing are often combined, as in image captioning and visual question answering tasks. Artificial intelligence also intersects with other fields such as control engineering, optimization, and cognitive science.

NLP

Let's take a closer look at natural language processing (NLP), an important branch of artificial intelligence and one of the most widely used AI technologies in practice.

NLP (Natural Language Processing) aims to enable computers to understand, parse, generate, and manipulate human language.

NLP technology is used in text classification, sentiment analysis, machine translation, question answering systems, speech recognition, automatic summarization, information extraction, and many other areas. Implementing NLP typically relies on foundational techniques such as text preprocessing, word embeddings, tokenization (word segmentation), part-of-speech tagging, and named entity recognition. These algorithms learn the structure and regularities of language from large corpora, and process and apply natural language through statistical analysis and machine learning models.
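As a small, hedged example of several of these building blocks, assuming spaCy and its small English model `en_core_web_sm` have been installed, tokenization, part-of-speech tagging, and named entity recognition can be run in a few lines:

```python
import spacy

# Requires: pip install spacy && python -m spacy download en_core_web_sm
nlp = spacy.load("en_core_web_sm")
doc = nlp("OpenAI released ChatGPT on November 30, 2022.")

for token in doc:
    print(token.text, token.pos_)      # tokenization + part-of-speech tagging

for ent in doc.ents:
    print(ent.text, ent.label_)        # named entity recognition (e.g. ORG, DATE)
```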

In recent years, with the development of deep learning, new deep-learning-based models have emerged in NLP, such as the Transformer and BERT. Pre-trained on large-scale corpora, these models achieve excellent performance across many NLP tasks. New application areas have also emerged, including dialogue systems, intelligent customer service, automated writing, and intelligent question answering.

What are Transformers?

We mentioned above that the branches of artificial intelligence intersect; the Transformer can be seen as sitting at the intersection of deep learning and NLP.


The Transformer is a neural network model from deep learning, originally developed and open-sourced by Google.

The Transformer was first proposed in the 2017 paper "Attention Is All You Need" and was later added to deep learning frameworks such as TensorFlow, making it easy for developers to use and extend. Today it is one of the most popular models in natural language processing.

TensorFlow is an open source deep learning framework for implementing neural network models. Therefore, the Transformer model can be implemented using TensorFlow. In fact, the TensorFlow team has provided a library called "Tensor2Tensor" that contains an implementation of the Transformer model. In addition, many researchers and engineers also use TensorFlow to implement their own Transformer models and use them in various NLP tasks.


The Transformer is particularly good at processing sequence data, including natural language text. In NLP it is widely used for tasks such as machine translation, text summarization, text classification, question answering, and language modeling. Compared with traditional models based on recurrent neural networks (RNNs), the Transformer models long-range dependencies in a sequence using self-attention and multi-head attention, which effectively alleviates the vanishing- and exploding-gradient problems of RNNs and therefore achieves strong performance on NLP tasks. The Transformer is thus a key deep learning model in NLP and an important part of modern NLP technology.
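For intuition, here is a minimal sketch of the scaled dot-product attention at the heart of the Transformer, written in PyTorch (a toy version, not an optimized multi-head implementation):

```python
import torch
import torch.nn.functional as F

def attention(q, k, v):
    """Scaled dot-product attention: softmax(QK^T / sqrt(d_k)) V."""
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / d_k ** 0.5   # similarity of every query to every key
    weights = F.softmax(scores, dim=-1)             # attention weights sum to 1 per query
    return weights @ v                              # weighted sum of the values

q = k = v = torch.randn(2, 5, 64)    # (batch, sequence length, embedding dimension)
out = attention(q, k, v)
print(out.shape)                     # torch.Size([2, 5, 64])
```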

Implementation of the Transformer model

The Transformer is an abstract model and algorithmic framework; a concrete implementation has to handle many details and techniques. In practice, the model must be designed, tuned, and trained for a specific task and dataset, and implemented and optimized with a specific software framework (such as TensorFlow or PyTorch) to improve efficiency and accuracy.

To implement a Transformer, you can use a deep learning framework such as TensorFlow or PyTorch. In general, the steps are as follows (a minimal code sketch follows the list):

  1. Data preparation: prepare training and test data, including corpus data and labels.

  2. Model architecture design: determine the structure of the model, including the Transformer's encoder and decoder and the attention mechanism.

  3. Model training: train the model on the training data and tune it to achieve better predictions.

  4. Model evaluation: evaluate the model on the test data, computing the loss, precision, recall, F1 score, and so on.

  5. Model deployment: deploy the trained model to a production environment for practical use.
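As a minimal sketch of step 2 (architecture design), PyTorch already ships Transformer building blocks, so a toy encoder can be assembled in a few lines. The sizes here are made up for illustration; this is not a task-ready model:

```python
import torch
import torch.nn as nn

# A stack of 6 Transformer encoder layers with 512-dimensional embeddings and 8 attention heads.
encoder_layer = nn.TransformerEncoderLayer(d_model=512, nhead=8, batch_first=True)
encoder = nn.TransformerEncoder(encoder_layer, num_layers=6)

tokens = torch.randn(2, 10, 512)     # (batch, sequence length, embedding dimension)
output = encoder(tokens)
print(output.shape)                  # torch.Size([2, 10, 512])
```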

A popular approach in industry is to start from an existing Transformer implementation in a deep learning framework such as TensorFlow or PyTorch and adapt it to your own needs. There are also third-party libraries, such as Hugging Face's Transformers, that can be used directly and are convenient and quick.
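For example, with Hugging Face's Transformers library a pretrained model can be used in a couple of lines (a sketch assuming the library is installed; the default checkpoint is downloaded on first use):

```python
from transformers import pipeline

classifier = pipeline("sentiment-analysis")            # loads a default pretrained Transformer
print(classifier("Transformers make NLP tasks much easier."))
# e.g. [{'label': 'POSITIVE', 'score': 0.99...}]
```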

Are there any other models?

There are many models built on the Transformer; some of the main ones are listed below (a short loading example follows the list):

  1. BERT (Bidirectional Encoder Representations from Transformers): BERT is a pre-trained language model launched by Google in 2018. It uses the encoder part of the Transformer model and uses a bidirectional Transformer model to model the input text.

  2. GPT (Generative Pre-trained Transformer): GPT is a pre-trained language model launched by OpenAI in 2018. It uses the decoder part of the Transformer model and is mainly used to generate text.

  3. XLNet: XLNet is a pre-trained language model proposed in 2019 by researchers from Carnegie Mellon University and Google. It uses a generalized autoregressive pre-training method that combines the strengths of autoregressive and autoencoding models, giving it strong generation ability and language understanding.

  4. T5 (Text-to-Text Transfer Transformer): T5 is a Transformer-based general text conversion model launched by Google in 2019, which can handle various NLP tasks, such as text classification, question answering, text summarization, etc.

  5. RoBERTa (Robustly Optimized BERT Pretraining Approach): RoBERTa is a pre-trained language model launched by Facebook in 2019. It improves the performance of various NLP tasks by optimizing the BERT training process.
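Conveniently, the Hugging Face Transformers library exposes these models through one interface; only the checkpoint name changes. A sketch, assuming the library is installed and the checkpoints can be downloaded:

```python
from transformers import AutoModel, AutoTokenizer

for checkpoint in ["bert-base-uncased", "gpt2", "roberta-base"]:
    tokenizer = AutoTokenizer.from_pretrained(checkpoint)   # downloads the tokenizer files
    model = AutoModel.from_pretrained(checkpoint)           # downloads the pretrained weights
    print(checkpoint, "->", model.config.model_type)
```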

All of these models are based on the Transformer architecture, with different optimizations and improvements to boost performance and broaden their range of application. The following picture shows the model family tree:

[Image: family tree of Transformer-based models]

GPT model

In 2018, OpenAI released GPT-1 (Generative Pre-trained Transformer), built on the Transformer architecture, with 117 million parameters. GPT-1 surpassed the original Transformer and led the industry at the time. From 2019 to 2020, OpenAI released GPT-2 and GPT-3, with 1.5 billion and 175 billion parameters respectively. GPT-3 was trained to take natural human language directly as instructions, which significantly improved the performance of large language models (LLMs) across many language scenarios.

ChatGPT


ChatGPT is a conversational AI model developed by the US company OpenAI, a natural language processing (NLP) tool powered by artificial intelligence. It was officially released on November 30, 2022. It can learn and understand human language, chat with people in the context of a conversation, and also write copy, translate text, program, draft video scripts, and more. By the end of January 2023, ChatGPT had reached 100 million monthly active users, making it the fastest-growing application in history by active users.

Compared with other existing similar products, the unique advantages of ChatGPT are:

  1. It is based on the GPT-3.5 architecture and trained on a massive corpus, including real-life conversations, which makes chatting with ChatGPT feel close to chatting with a human.

  2. It applies RLHF (Reinforcement Learning from Human Feedback), so it more accurately understands and follows human intent, values, and needs.

  3. Model training can be done in the same stage

  4. It has powerful computing power, self-learning ability and adaptability, and the pre-training is highly versatile

  5. Can carry out multiple rounds of dialogue in a row to improve user experience

  6. It shows more independent, critical thinking: it can question the reasonableness of a user's question, acknowledge the limits of its own knowledge, and take user feedback to improve its answers.

GPT-3.5

The GPT-3.5 model behind ChatGPT builds on GPT-3 by adding Reinforcement Learning from Human Feedback (RLHF) and the Proximal Policy Optimization (PPO) algorithm, optimizing the output to reduce the risk of harmful content, such as racial or gender discrimination, generated by the pre-trained model.

The ChatGPT training process has three main stages.

[Image: the three training stages of ChatGPT]

  1. The first step is supervised fine-tuning. Human annotators write the expected responses for randomly selected prompts, and GPT-3.5 is fine-tuned on these examples through supervised learning, producing a Supervised Fine-Tuning (SFT) model that has an initial grasp of instructions. This step follows the same approach as earlier GPT-3 training and is similar to a teacher giving students worked answers.

  2. The second step trains a reward model. Prompts are randomly sampled, the SFT model generates several candidate responses for each, and human annotators rank how well the responses match the prompt. The prompt-response pairs and their rankings are then used to train a reward model that scores responses (a toy sketch of this scoring loss follows the list). This step is like students writing their own answers in a mock quiz and the teacher grading each answer.

  3. The third step is Proximal Policy Optimization (PPO), the most prominent upgrade in ChatGPT. Using the scoring mechanism from the second step, the SFT model is further trained and iteratively optimized against the reward model, improving the quality of ChatGPT's output. In other words, students revise their answers according to the teacher's feedback so that the answers come closer to the high-scoring standard.
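To make the second step more concrete, here is a toy sketch (in PyTorch, with made-up numbers) of the pairwise ranking loss a reward model is typically trained with: the loss pushes the reward of the preferred answer above the reward of the rejected one.

```python
import torch
import torch.nn.functional as F

# Toy scores a reward model might assign to the preferred ("chosen") and the
# less-preferred ("rejected") answer for two prompts.
reward_chosen = torch.tensor([1.8, 0.4], requires_grad=True)
reward_rejected = torch.tensor([0.9, 0.7], requires_grad=True)

# Pairwise ranking loss: -log(sigmoid(r_chosen - r_rejected)), averaged over pairs.
loss = -F.logsigmoid(reward_chosen - reward_rejected).mean()
loss.backward()
print(loss.item())
```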

The advantages of ChatGPT are:

  1. It pre-trains the underlying model on GPT-3 with 175 billion parameters, one of the largest language models in the world.

  2. Its computing power is backed by Microsoft, using tens of thousands of NVIDIA A100 GPUs for training, which guarantees the model's running speed. (This again shows how much the hardware matters: the A100 export restriction really stings, but the major players stockpiled before the ban, which should cover demand in the short term, and the A800, the substitute for the A100, will ship soon, so training efficiency should recover quickly and demand should still be met.)

  3. Algorithmically, it iterates with a reward model and proximal policy optimization, aligning the output with the answers humans expect, reducing harmful and discriminatory responses, making ChatGPT more human-like, and making the conversation feel smoother to users.

GPT-4


According to the German outlet Heise, at an AI-related event on March 9 local time, four Microsoft Germany employees introduced large language models (LLMs), including the GPT series. Microsoft Germany's Andreas Braun said that GPT-4 would be released soon.

GPT-4 is being developed to work with essentially all languages: you can ask a question in German and get an answer in Italian. With multimodality, Microsoft and OpenAI will make the models "comprehensive," offering completely different possibilities, such as video.

AIGC model

In the field of AI-generated content (AIGC), OpenAI is not the only player. Let's take a look at where the top players stand today:

[Image: overview of leading AIGC players]

Artificial intelligence breaks through Moore's Law

Moore's Law is a prediction made in 1965 by Gordon Moore, one of the founders of Intel Corporation. This forecast assumes that the number of transistors that can fit on an integrated circuit will double every 18 to 24 months, while cost remains the same or decreases.

Put simply, Moore's Law predicts that over time, the number of transistors that can be packed on a computer chip will increase exponentially, while the cost will continue to decrease. This means that computer performance will continue to increase on the same chip area, while the cost of computers will continue to decrease.

Moore's Law has played an important role in the computer industry over the past few decades and is one of the emblematic markers of computing progress. In recent years, however, as Moore's Law approaches its limits, some have begun to doubt its sustainability.

Moore's Law is commonly summarized in three versions (a quick calculation of what the doubling implies follows the list):

  1. The number of transistors that can be accommodated on an integrated circuit doubles approximately every 18 months.

  2. Microprocessors double in performance or halve in price every 18 months.

  3. Computers bought for the same price double in performance every 18 months.
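A quick back-of-the-envelope calculation shows what "doubling every 18 months" implies over a decade:

```python
months = 10 * 12                      # a 10-year span
doublings = months / 18               # one doubling every 18 months
growth = 2 ** doublings
print(f"{doublings:.1f} doublings -> roughly {growth:.0f}x more transistors")
# 6.7 doublings -> roughly 102x more transistors
```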

As models iterate, the demand for computing power keeps growing:

[Image: computing-power demand of successive AI models]

At present, artificial intelligence's demand for computing power has already outpaced Moore's Law.

Future

I have already started using ChatGPT in many scenarios, such as programming, writing emails, and learning new things. Going forward, I plan to build ChatGPT applications so that more people can experience its appeal.

The future has come, what is missing is not technology, but imagination!

 


Origin blog.csdn.net/CDB3399/article/details/132059308