Artificial intelligence is actually quite simple, so why do people treat it as something mysterious?

(1)

Many people think that a large model has so many parameters because of the large amount of training data.

In fact, a large model's parameter count is determined by its hyperparameter configuration. The main hyperparameters are:

  • Vocabulary size: vocab_size

  • Maximum position encoding length: max_position_embeddings

  • Hidden layer size: hidden_size

  • Number of hidden layers: num_hidden_layers

  • Number of attention heads: num_attention_heads

These are found in each model's config file (usually config.json).
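As a rough illustration (not the exact formula for any particular architecture), a GPT-style decoder's parameter count can be estimated from these hyperparameters alone; the volume of training data never enters the calculation. The config values below are hypothetical, sized like a GPT-2-small model:

```python
# Rough parameter estimate for a GPT-style decoder-only Transformer.
# Config values are illustrative, shaped like a GPT-2-small model.
config = {
    "vocab_size": 50257,
    "max_position_embeddings": 1024,
    "hidden_size": 768,
    "num_hidden_layers": 12,
    "num_attention_heads": 12,
}

def estimate_params(cfg):
    h = cfg["hidden_size"]
    # Token embedding table plus position embedding table
    embeddings = cfg["vocab_size"] * h + cfg["max_position_embeddings"] * h
    # Per layer: Q/K/V/output projections (~4*h*h) plus a 4x-wide MLP (~8*h*h);
    # biases and layer norms are small and ignored here
    per_layer = 4 * h * h + 8 * h * h
    return embeddings + cfg["num_hidden_layers"] * per_layer

print(f"{estimate_params(config):,}")  # prints 124,318,464 (~124M)
```

Note that the attention-head count changes how the hidden dimension is split, not the total parameter count; the dominant terms are the embedding tables and the per-layer projection matrices.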

Many people don't even know how large models work. They read neither the papers nor the source code; they just hype large models and fantasize about them.

(2)

Many people think that large models are smart because of the large amount of data.

In fact, the amount of data does not by itself determine how smart a large model is.

The value of the massive, high-quality data produced by humans (articles written by humans, photos taken by humans, videos edited by humans) is that it lets the model obtain structural feature labels automatically through maximum-likelihood statistics, eliminating the need for us humans to label the features of the data by hand. That is its real value.
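A minimal sketch of why raw human text already carries its own labels: under next-token prediction, every (context, next word) pair is a labeled training example that no human had to annotate. The sentence here is just an illustration:

```python
# Turn raw text into (input, label) pairs automatically: the "label"
# for each position is simply the next token in the human-written text.
text = "the cat sat on the mat"
tokens = text.split()

pairs = [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]
for context, label in pairs:
    print(context, "->", label)
# e.g. ['the'] -> cat
#      ['the', 'cat'] -> sat
#      ... and so on
```

One sentence yields five free training examples; a corpus of trillions of tokens yields trillions of them, with zero manual annotation.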

Think about another activity with the same purpose. In enterprise informatization, business activities happen in the real world; data-entry staff recall them, think them through, and then type them field by field into the structured, discrete forms on screen, characterizing continuous physical-world activity. In essence, when the clerk fills in those fields, he is manually labeling the features of the data.

(3)

Many people think large models are smart because they imitate the structure of the human brain. This is a bit like the Chinese quack healers' folk belief that "like nourishes like."

In fact, the structure of a large model and the structure of the human brain, and the "neural network" a large model uses versus the neural networks in the human brain, are fundamentally different things. They merely share a name; the structures have no real similarity at all.

Many people find it amazing that ChatGPT can understand and answer like an ordinary person. In fact, what's so magical about it? ChatGPT was tuned by a large number of human experts through four post-pretraining stages: Fine-Tuning, Prompt-Tuning, Instruction-Tuning, and RLHF. Of course it looks human; it was tuned by humans.
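Those post-pretraining stages all boil down to humans supplying curated examples. A hedged sketch of what one instruction-tuning record might look like; the field names are illustrative, not any vendor's actual schema:

```python
import json

# Illustrative instruction-tuning record: a human expert wrote both the
# instruction and the desired answer; the model is tuned to imitate it.
record = {
    "instruction": "Explain what vocab_size controls in a language model.",
    "response": "It sets how many distinct tokens the model's embedding "
                "table and output layer can represent.",
}

line = json.dumps(record)  # one JSON line per example, a common convention
print(line)
```

Collect hundreds of thousands of such human-written records and fine-tune on them, and the model's human-like answering style is no mystery at all.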

(4)

Many people are amazed by the so-called acupoint pressing, acupuncture, and meridians peddled by Chinese quack healers. In fact, what's so amazing about that?

The nervous system is essentially the body's information highway, transmitting information via bioelectricity; the principle is the same as the optical fiber in our homes.

The so-called acupoint strike or acupuncture needle is nothing more than blocking the fine nerve pathways, so that information cannot flow smoothly up and down, cutting the brain's commands off from the body's feedback.

(5)

Many people also find dreaming magical. In fact, what's so amazing about that?

The brain's prefrontal region is essentially a very fast but small-capacity memory. You receive information from the outside world all day, and at night it must be cleared out, transferred from hot storage to cold storage.

Some information you receive every day, so it isn't transferred again and again; instead, the information you use daily gets its weight boosted, much like a search engine's PageRank.
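The reweighting idea can be sketched as a toy PageRank power iteration. The three-page link graph is made up purely for illustration: pages with more incoming links end up with higher weight, much as memories reinforced daily do:

```python
# Toy PageRank: power iteration on a tiny made-up link graph.
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
damping = 0.85
ranks = {page: 1 / len(links) for page in links}

for _ in range(50):  # iterate until the ranks stabilize
    new_ranks = {}
    for page in links:
        # Each page receives a share of the rank of every page linking to it
        incoming = sum(
            ranks[src] / len(outs)
            for src, outs in links.items()
            if page in outs
        )
        new_ranks[page] = (1 - damping) / len(links) + damping * incoming
    ranks = new_ranks

print(max(ranks, key=ranks.get))  # prints C: it has the most incoming links
```

"C" is linked to by both "A" and "B", so it accumulates the highest weight, just as frequently revisited information accumulates reinforcement.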

Other information is new; the brain measures the similarity between the new information and old information, automatically classifies or clusters it by that measure, and then links the new and old information together. This, too, is very similar to a search engine or a fully connected network.
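The linking-by-similarity step can be sketched with cosine similarity over feature vectors; the vectors and memory names here are purely illustrative:

```python
import math

def cosine(u, v):
    # Cosine similarity between two equal-length feature vectors
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

# Made-up feature vectors for two old memories and one new piece of information
old_memories = {"bicycle": [1.0, 0.9, 0.1], "piano": [0.1, 0.2, 1.0]}
new_info = [0.9, 0.8, 0.2]  # something bike-like

# Link the new information to the most similar old memory
nearest = max(old_memories, key=lambda k: cosine(new_info, old_memories[k]))
print(nearest)  # prints bicycle
```

The new item gets attached to whichever stored memory it most resembles, which is exactly the "automatic classification then linking" described above.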

If some information is too new and unfamiliar, your brain cannot connect it with what you already have, and at that point it starts to malfunction. Either your brain is dull and throws the information away, so you experience something new and promptly forget it. Or, if your brain is not so dull, it starts to fabricate, forcing connections based on similarity of information: that is your dream. In a dream the scene feels both familiar and strange, and that is its essence. So when everyone now watches large models making things up indiscriminately, the principle is the same as dreaming.

(6)

Many people are amazed, as if the era of the sci-fi movie Terminator had arrived. Worse, they transfer onto large models concepts that quacks defined out of human imagination, such as: understanding, logic, reasoning, epiphany, emergence...

What kind of thinking is that?

I once read a passage on Zhihu; let me copy it here. The principle is actually very simple:

Memory is a first-order associative connection from raw data to represented data.

Inference rules and inference methods are themselves second-order associative connections within memory.

In small-scale models the density of second-order links is sparse; in sufficiently large models it can exceed 50%, forming connected paths, which looks like reasoning ability.

The so-called logic and principles of the past were seemingly self-evident assumptions that humans supplied through prior knowledge, but in an LLM this part can be generated, provided the training method is right. This challenges the inductive and deductive methods humans have considered unshakable for hundreds of years. It now seems induction and deduction are not true first principles; they are actually explainable and constructible.

To sum up, past training regimes and model scales produced only sparse high-order connections; after GPT-3.5, the density of high-order correlations reached the threshold of global connectivity. So GPT gives people the feeling that it can reason logically and hold long conversations. This is just an appearance. It indirectly shows that the logic, axioms, assumptions, truths, and meanings humans have worshipped for thousands of years actually live at the level of language; they are merely metaphysics.

Therefore, once the underlying principle is explained thoroughly, many things turn out to be very simple. But if you never dig to the bottom of things and always swallow knowledge whole without chewing, it is easy for quacks to be deified and worshipped.



Origin: blog.csdn.net/david_lv/article/details/130776046