A Silicon Valley geek breaks the news on GPT-4's parameters and design, while airing complaints about OpenAI and Musk

GPT-4 is the latest natural-language generation model released by OpenAI and another breakthrough after GPT-3. It reportedly has an astonishing 1.76T parameters and can generate a wide variety of text. However, OpenAI has not disclosed many details about GPT-4's specific parameters or architecture, saying only that it is a transformer-based model trained on a large amount of data.


Recently, George Hotz, a well-known Silicon Valley geek, revealed some inside information about GPT-4 on a podcast. He said that GPT-4 is actually eight identical 220B-parameter models stitched together, each trained on different data: a mixture-of-experts design with 8 expert models, where each inference requires 16 rounds of computation. He believes this design is not very elegant, and that it can lead the model to hallucinate or to collapse into repeating its own output.
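Hotz's description (several expert models, with a router deciding which to consult for a given input) matches the standard mixture-of-experts pattern. Below is a minimal toy sketch of that routing idea in plain NumPy; the sizes, the top-2 routing, and all names are illustrative assumptions, not GPT-4's actual design:

```python
import numpy as np

rng = np.random.default_rng(0)

D, N_EXPERTS, TOP_K = 16, 8, 2  # toy dimensions, not GPT-4's real sizes

# Each "expert" is reduced to a single small weight matrix here.
experts = [rng.standard_normal((D, D)) * 0.1 for _ in range(N_EXPERTS)]
router = rng.standard_normal((D, N_EXPERTS)) * 0.1  # gating network

def moe_forward(x):
    """Route the input to the top-k experts and mix their outputs."""
    logits = x @ router
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()                      # softmax over experts
    top = np.argsort(probs)[-TOP_K:]          # indices of chosen experts
    weights = probs[top] / probs[top].sum()   # renormalize over chosen ones
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

y = moe_forward(rng.standard_normal(D))
print(y.shape)  # (16,)
```

Only the chosen experts run for a given input, which is why an 8x220B ensemble can be cheaper per token than a single dense 1.76T model.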

George Hotz not only has a detailed understanding of GPT-4; he also has his own sharp views on OpenAI and Musk. He both admires and disdains OpenAI: he admits that OpenAI is the absolute leader in deep learning, with top-tier engineers and theorists, but he dislikes how much of its success rests on sheer engineering muscle, calling it a bitter lesson. He said OpenAI does a lot of unnecessary work, such as reimplementing transformers in JAX. In his view, the secret of the transformer's effectiveness is not the attention mechanism itself but its "semi-weight sharing": because the weight matrix is generated dynamically from the input, the model can effectively compress its weights.
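His "semi-weight sharing" remark can be read as follows: attention builds a data-dependent mixing matrix on the fly, rather than applying one fixed learned layer. A small NumPy illustration of that reading (toy sizes; the names are mine, not from the podcast):

```python
import numpy as np

rng = np.random.default_rng(0)
T, D = 4, 8  # sequence length and model dimension (toy sizes)

X = rng.standard_normal((T, D))
Wq, Wk, Wv = (rng.standard_normal((D, D)) * 0.1 for _ in range(3))

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# The (T, T) attention matrix A is generated from the input itself:
# a dynamic weight matrix, unlike the static Wq/Wk/Wv projections.
Q, K, V = X @ Wq, X @ Wk, X @ Wv
A = softmax(Q @ K.T / np.sqrt(D))
out = A @ V

assert A.shape == (T, T)
assert np.allclose(A.sum(axis=1), 1.0)  # each row is a proper weighting
```

The static parameters (Wq, Wk, Wv) are shared across all positions, while the effective mixing weights A differ per input, hence "semi" weight sharing.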

George Hotz also has history with Musk. Musk once invited him on Twitter to work at Tesla, and he received a verbal offer, but the deal ultimately fell through and he instead founded his own autonomous-driving startup. He said he and Musk come from different disciplinary backgrounds: Musk's is physics, his is information theory. Musk wants to go to Mars; he wants to build AI robots. His roadmap: his first company builds the hardware infrastructure, his second, TinyCorp, builds the software infrastructure, and a third company will be the first to build the real product. That product is an AI girlfriend.

In addition to his views on GPT-4, OpenAI, and Musk, George Hotz shared three conjectures about the trajectory of AI development:

**First conjecture:** AI computing power accelerates by six orders of magnitude every ten years

**Second conjecture:** The error rate of AI's overall capabilities (perception/decision-making/generation) drops by an order of magnitude every ten years

**Third conjecture:** Every time the AI error rate drops by an order of magnitude (plus the emergence of new capabilities), the scope of application (market size) grows by an order of magnitude
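Taken literally, the three conjectures compound per decade. A quick back-of-the-envelope sketch (my arithmetic applied to his stated rates, not Hotz's own numbers):

```python
# Hotz's conjectures, taken literally: per decade, compute grows by 10^6,
# the error rate drops by 10x, and market size grows by 10x per
# order-of-magnitude drop in error rate.
def project(decades, compute=1.0, error=1.0, market=1.0):
    for _ in range(decades):
        compute *= 10**6
        error /= 10
        market *= 10
    return compute, error, market

c, e, m = project(2)
print(c, e, m)  # after 20 years: compute x10^12, error /100, market x100
```

The asymmetry is the point: compute must grow six orders of magnitude to buy one order of magnitude of error reduction.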

He used some charts and data to support these conjectures. He believes the pace of AI development is astonishing but not unbounded: many challenges and difficulties remain. He said that every further step in reducing the cross-entropy loss, the loss function used to train these models, is genuinely hard-won, and the computing power it consumes grows exponentially. The road ahead is long, he said, and he will keep searching high and low.
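The loss he refers to is the standard cross-entropy, the negative log-probability of the correct next token under the model's softmax distribution. A minimal sketch with illustrative numbers:

```python
import numpy as np

def cross_entropy(logits, target):
    """Negative log-probability of the target token under softmax(logits)."""
    z = logits - logits.max()              # shift for numerical stability
    log_probs = z - np.log(np.exp(z).sum())
    return -log_probs[target]

# A confident correct prediction drives the loss toward 0; each further
# drop from an already-low loss demands disproportionately more compute.
confident = cross_entropy(np.array([5.0, 0.0, 0.0]), 0)
uniform = cross_entropy(np.array([0.0, 0.0, 0.0]), 0)  # equals log(3)
print(confident, uniform)
```

A model that is merely guessing among N tokens sits at log(N); grinding the loss below that is where the exponential compute cost comes in.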

This podcast gives us a glimpse into the thinking and values of a Silicon Valley geek. Some of his views may be controversial, but they are thought-provoking. George Hotz is an interesting and talented person, and some of his projects and ideas, such as tinygrad and tinybox, are well worth watching. We look forward to his next move and hope he can realize his dream.


Origin blog.csdn.net/virone/article/details/131374888