An introduction to the artificial intelligence (AI) industry

Artificial Intelligence (AI)

The field traces its start to 1950, when Alan Turing published "Computing Machinery and Intelligence".

Artificial intelligence, abbreviated AI, is a technical science that studies and develops theories, methods, technologies, and application systems for simulating, extending, and expanding human intelligence. AI uses computers and machines to imitate the problem-solving and decision-making abilities of the human mind. It is a very broad field covering many subfields, such as machine learning, computer vision, speech recognition, and natural language processing.

There are three pillars of AI development: data, algorithms, and computing power (CPUs, GPUs, TPUs). CPUs are mainly suited to IO-intensive tasks, while GPUs are mainly suited to compute-intensive tasks. A compute-intensive program is one that spends most of its running time on register and arithmetic operations; registers run at processor speed, so reading and writing them incurs almost no delay. By comparison, a main-memory access costs on the order of a few hundred clock cycles, and reading from disk is far slower still, even on an SSD.
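As a rough illustration of the difference, here is a minimal Python sketch that times a compute-bound operation against an IO-bound one; the matrix sizes, file name, and file size are illustrative assumptions, and the exact numbers will vary by machine.

```python
# Minimal sketch: contrast a compute-bound task with an IO-bound task.
import time
import numpy as np

# Compute-bound: time is dominated by arithmetic on data already in registers/cache.
a = np.random.rand(2000, 2000)
b = np.random.rand(2000, 2000)
start = time.perf_counter()
c = a @ b  # dense matrix multiplication
print(f"matmul:    {time.perf_counter() - start:.3f} s")

# IO-bound: time is dominated by waiting on the storage device.
with open("scratch.bin", "wb") as f:
    f.write(np.random.bytes(200 * 1024 * 1024))  # write ~200 MB of random bytes
start = time.perf_counter()
with open("scratch.bin", "rb") as f:
    data = f.read()
print(f"file read: {time.perf_counter() - start:.3f} s")
```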

Artificial intelligence, machine learning, and deep learning cover progressively narrower scopes. Artificial intelligence is the broadest concept; machine learning is currently the most effective way to implement artificial intelligence; and deep learning is the most popular branch of machine learning, having made significant progress in recent years and displaced many traditional machine learning algorithms.

Data science

Data science is a multi-disciplinary field spanning data acquisition, data analysis, data management, machine learning, statistical optimization, and data visualization. It has gradually become an effective way to explore large data sets and turn big data into actionable insight.

Data mining is not new; it was proposed many years ago, and it has come back into discussion as artificial intelligence has attracted attention in recent years. Data mining refers to the process of using algorithms to search for information hidden in large amounts of data. It is usually associated with computer science and achieves this goal through methods such as statistics, online analytical processing, information retrieval, machine learning, expert systems (which rely on past rules of thumb), and pattern recognition.
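As one concrete example of such a method, here is a minimal clustering sketch that groups unlabeled points to reveal hidden structure; the use of scikit-learn and the toy data are illustrative assumptions, not part of the original article.

```python
# Minimal data-mining sketch: k-means clustering finds groups hidden in unlabeled data.
import numpy as np
from sklearn.cluster import KMeans

# Toy data: two obvious groups of 2-D points.
points = np.array([[1.0, 1.1], [0.9, 1.0], [1.2, 0.8],
                   [8.0, 8.2], [7.9, 7.8], [8.3, 8.1]])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(points)
print(kmeans.labels_)           # e.g. [0 0 0 1 1 1] -- the hidden grouping
print(kmeans.cluster_centers_)  # the center of each discovered group
```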

Natural Language Processing (NLP)

NLP stands for natural language processing, a technology that uses computer science and artificial intelligence to process and analyze human natural language. Modern NLP is a hybrid discipline combining linguistics, computer science, and machine learning, designed to enable computers to understand, process, and generate the natural language used by humans. NLP is applied to text classification, information extraction, machine translation, speech recognition, sentiment analysis, question answering, natural language generation, and many other areas.

It works like this:

  1. Receive natural language, the language that evolved through human use and that we use every day to communicate.
  2. Translate the natural language into a form the machine can work with, usually through probability-based algorithms.
  3. Analyze the natural language and output the results.

In short, building an NLP system is a matter of designing such an algorithm; a minimal sketch is shown below.
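As a concrete illustration of step 2 (probability-based processing), here is a minimal text-classification sketch; the use of scikit-learn, the tiny toy corpus, and the labels are all illustrative assumptions.

```python
# Minimal sketch: probability-based text classification.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Toy corpus and labels, purely illustrative.
texts = ["I love this movie", "What a great film", "Terrible acting", "I hated the plot"]
labels = ["positive", "positive", "negative", "negative"]

vectorizer = CountVectorizer()          # turn raw text into word-count features
X = vectorizer.fit_transform(texts)

model = MultinomialNB()                 # a simple probabilistic classifier
model.fit(X, labels)

print(model.predict(vectorizer.transform(["a great plot"])))  # e.g. ['positive']
```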

Machine learning

Classical, or "non-deep", machine learning relies more on human intervention to learn: human experts determine the hierarchy of features used to distinguish between data inputs, and the process usually requires more structured data.
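A minimal sketch of this hand-crafted-feature workflow, assuming scikit-learn and a toy spam-detection task (both are illustrative assumptions):

```python
# Classical ML sketch: a human decides which features matter, then a simple model learns from them.
from sklearn.linear_model import LogisticRegression

def extract_features(message: str):
    # Hand-designed features chosen by a human "expert" (illustrative only).
    return [
        len(message),                    # message length
        message.count("!"),              # number of exclamation marks
        int("free" in message.lower()),  # contains the word "free"?
    ]

messages = ["Win a FREE phone now!!!", "Lunch at noon?", "FREE entry, click now!", "See you tomorrow"]
labels = [1, 0, 1, 0]  # 1 = spam, 0 = not spam

X = [extract_features(m) for m in messages]
clf = LogisticRegression().fit(X, labels)
print(clf.predict([extract_features("Claim your free prize!")]))  # e.g. [1]
```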

Model parameters

Model parameters are configuration variables internal to the model whose values can be estimated from the data. Parameters are the key to machine learning algorithms: they are the part of a model learned from historical training data.
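For example, in a simple linear model y ≈ w·x + b, the weight w and intercept b are the parameters estimated from data. A minimal sketch, with scikit-learn and the toy numbers as illustrative assumptions:

```python
# The fitted coefficient and intercept are the model's parameters, learned from the data.
import numpy as np
from sklearn.linear_model import LinearRegression

X = np.array([[1.0], [2.0], [3.0], [4.0]])
y = np.array([3.1, 5.0, 6.9, 9.1])      # roughly y = 2x + 1 with a little noise

model = LinearRegression().fit(X, y)
print(model.coef_, model.intercept_)    # parameters: approximately [2.0] and 1.0
```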

Deep learning

Deep learning can be thought of as "scalable machine learning."

The "depth" in deep learning refers to a neural network composed of more than three layers, including input and output, and can be considered a deep learning algorithm.

The difference between deep learning and classical machine learning lies in how each algorithm learns. Deep learning automates much of the feature-extraction process, eliminating some of the manual intervention otherwise required and enabling the use of larger data sets.

"Deep" machine learning can use labeled data sets, also known as supervised learning, to determine the algorithm, but it does not necessarily have to use labeled data sets. It can capture unstructured data in raw formats (e.g., text, images) and automatically determine a hierarchy of features that distinguish different categories of data.

Framework: TensorFlow

TensorFlow is a deep learning framework developed by the Google team. It is open-source software whose primary interface is the Python language. TensorFlow's original goal was to make the concepts of machine learning and deep learning as simple as possible to implement; it combines optimization techniques from computational algebra so that many mathematical expressions can be computed efficiently.
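A minimal sketch of defining and training a small network with TensorFlow's Keras API; the random toy data, layer sizes, and training settings are illustrative assumptions.

```python
# Minimal TensorFlow/Keras sketch: a tiny fully connected network on random toy data.
import numpy as np
import tensorflow as tf

X = np.random.rand(100, 4).astype("float32")   # 100 samples, 4 features
y = (X.sum(axis=1) > 2.0).astype("float32")    # toy binary labels

model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(8, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X, y, epochs=5, verbose=0)
print(model.predict(X[:3]))
```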

Neural Networks

A neural network is the foundation of deep learning. It is a network composed of many neurons; each neuron receives inputs from other neurons and converts them into an output through an activation function. Neural networks are trained with the backpropagation algorithm, which optimizes the network's weights and biases so that it better fits the data.

Neural networks, also known as artificial neural networks (ANNs) or simulated neural networks (SNNs), are a subset of machine learning and the core of deep learning algorithms. Their name and structure are inspired by the human brain, mimicking the way biological neurons signal one another.

An artificial neural network (ANN) consists of layers of nodes: an input layer, one or more hidden layers, and an output layer. Each node, also called an artificial neuron, connects to other nodes with an associated weight and threshold. If a node's output exceeds the specified threshold, the node is activated and its data is sent to the next layer of the network; otherwise, no data is passed on.
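A minimal sketch of a single artificial neuron, showing the weighted sum, activation, and threshold described above; the specific numbers and the choice of sigmoid activation are illustrative assumptions.

```python
# One artificial neuron: weighted sum of inputs -> activation -> threshold check.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

inputs = np.array([0.5, 0.3, 0.8])     # outputs from the previous layer
weights = np.array([0.4, -0.2, 0.9])   # learned connection weights
bias = 0.1
threshold = 0.5

activation = sigmoid(np.dot(weights, inputs) + bias)
fires = activation > threshold          # only then is data passed to the next layer
print(activation, fires)                # about 0.72, True
```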

Neural networks rely on training data to learn and improve their accuracy over time. Once tuned for accuracy, they become powerful tools in computer science and artificial intelligence, allowing data to be classified and grouped quickly. Tasks in speech recognition or image recognition may take minutes rather than the hours that manual identification by human experts would require. One of the best-known neural networks is the one behind Google's search algorithm.

Large language models

A large language model (LLM) is an artificial intelligence model designed to understand and generate human language. LLMs are trained on large amounts of text data and can perform a wide range of tasks, including text summarization, translation, sentiment analysis, and more. They are characterized by their scale, containing billions of parameters, which helps them learn complex patterns in language data. These models are usually based on deep learning architectures such as the Transformer, which helps them achieve impressive performance on a variety of NLP tasks.

An LLM is a neural network trained on large amounts of text data. The training process lets the model learn patterns in text, including grammar, syntax, and word associations. The model then uses these learned patterns to generate human-like text, making it well suited to natural language processing (NLP) tasks.
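A minimal sketch of generating text with a small pretrained language model, assuming the Hugging Face transformers library and the publicly available gpt2 checkpoint (a much smaller relative of the GPT models discussed below); both are illustrative assumptions.

```python
# Minimal sketch: text generation with a small pretrained model via Hugging Face transformers.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")   # gpt2 is a small public checkpoint
result = generator("Artificial intelligence is", max_new_tokens=20, num_return_sequences=1)
print(result[0]["generated_text"])
```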

Take GPT (Generative Pre-trained Transformer) as an example: GPT has gone through several generations. GPT-3 was trained on about 45 TB of text data, so all of Wikipedia amounts to only about 0.6% of its training data. The text used for this training is called the corpus, i.e., language material, and a corpus of this size can be said to contain the essence of human linguistic civilization. It is a very, very large body of data.

After learning from such a large amount of data, something happened that the computer scientists working on AI did not expect and cannot fully explain: once the amount of data and model scale passes a certain critical point, the model achieves significant performance improvements and develops capabilities that do not exist in smaller models, such as in-context learning. This has had two consequences: the major AI players have kept increasing the number of training parameters in pursuit of better results, and the unexplained qualitative change has raised AI-safety concerns.

Emergent capabilities of large language models

  • In-context learning. GPT-3 formally introduced in-context learning: given a natural language instruction and/or several task demonstrations, the model can generate the expected output for a test instance simply by completing the input word sequence, without additional training or gradient updates (see the prompt sketch after this list).
  • Instruction following. By fine-tuning on a mixture of multi-task data sets formatted with natural language descriptions (i.e., instructions), an LLM performs well on unseen tasks that are likewise described in instruction form. Instruction tuning thus enables an LLM to perform new tasks by understanding the task instructions without explicit examples, which greatly improves generalization.
  • Step-by-step reasoning. Small language models usually struggle with complex tasks involving multiple reasoning steps, such as mathematical word problems. With the chain-of-thought prompting strategy, however, an LLM can solve such tasks by using a prompt that includes intermediate reasoning steps to arrive at the final answer. This ability is presumed to be acquired, at least in part, from training on code.
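A minimal sketch of the two prompting styles mentioned above, written as plain Python strings; the example questions are illustrative assumptions, and how the prompts are sent to a model depends on whichever LLM API is chosen.

```python
# Illustrative prompts only; no specific LLM provider or API is implied.

# In-context (few-shot) learning: demonstrations are given in the prompt, no fine-tuning involved.
few_shot_prompt = """Translate English to French.
sea otter => loutre de mer
cheese => fromage
peppermint =>"""

# Chain-of-thought prompting: ask the model to spell out intermediate reasoning steps.
cot_prompt = """Q: A shop had 23 apples and sold 9, then bought 6 more. How many apples now?
A: Let's think step by step.
23 - 9 = 14 apples remain after the sale.
14 + 6 = 20 apples after buying more.
The answer is 20.
Q: A bus had 15 passengers; 4 got off and 7 got on. How many passengers now?
A: Let's think step by step."""

print(few_shot_prompt)
print(cot_prompt)
```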

AutoML

AutoML is short for automated machine learning: automating steps of the machine learning workflow such as model selection and hyperparameter tuning.
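A minimal sketch using one of the open-source AutoML libraries, FLAML; the library choice, the iris data set, and the time budget are illustrative assumptions.

```python
# Minimal AutoML sketch with FLAML (assumed installed): the library searches models and
# hyperparameters automatically within a small time budget.
from flaml import AutoML
from sklearn.datasets import load_iris

X, y = load_iris(return_X_y=True)

automl = AutoML()
automl.fit(X_train=X, y_train=y, task="classification", time_budget=30)  # seconds
print(automl.best_estimator)   # name of the best model found
print(automl.predict(X[:5]))   # predictions from the best model
```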


Origin blog.csdn.net/qq_29334605/article/details/130264291