Breaking: GPT-4 is officially released, and multimodal capability is here!

Today, OpenAI released its latest artificial intelligence language model, GPT-4, a milestone technological advance. GPT-4 not only generates human-like text, but also accepts both images and text as input and produces text as output. This means GPT-4 can process multiple types of information and communicate with humans in a more natural and fluid way.

GPT-4 builds on GPT-3.5, the model that powers ChatGPT, the popular online chatbot. Compared with GPT-3.5, GPT-4 has a larger model size and more training data, which improves the quality and diversity of its generated text. According to OpenAI, GPT-4 demonstrates "human-level" performance on benchmarks across a range of professional and academic domains. For example, it scores in the top 10% of test takers on a simulated bar exam; on medical diagnosis tasks it is reported to be on par with experienced doctors; and in creative writing it generates compelling stories, poems, and lyrics. GPT-4 has also achieved strong results on standardized tests in the United States: it scores 1410 on the SAT, higher than 88% of test takers; 332 on the GRE; and a passing grade on the AP Calculus exam. These results suggest that GPT-4 has strong capabilities in language understanding, logical reasoning, and mathematical calculation.

(Comparison of scores between GPT-4 and the previous-generation GPT-3.5 on different exams)

In addition to text input, GPT-4 can also process image input, treating images in parallel with plain text and letting users specify any vision or language task. Specifically, given an input consisting of interspersed text and images, it generates text output (natural language, code, etc.). For example, a user can show GPT-4 an image and have it describe or explain what the image contains. This is a very useful feature for visually impaired users or anyone who wants more information about an image. OpenAI is currently working with Be My Eyes to test this capability: Be My Eyes is a smartphone app for blind and low-vision users, and its upcoming GPT-4-powered feature will provide real-time image descriptions.
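To make the "interspersed text and images in, text out" idea concrete, here is a minimal sketch using the official `openai` Python SDK (v1.x). This is not taken from the article: at the time of GPT-4's announcement, image input was not yet generally available through the API, so the model name, the content-parts message format, and the image URL below are illustrative assumptions based on the later public vision API.

```python
# Minimal sketch (not from the article): sending interleaved text and an image
# to a vision-capable GPT-4 model with the openai Python SDK (v1.x).
# Assumptions: the "gpt-4o" model name and the placeholder image URL are
# illustrative only; substitute whatever vision-capable model your account offers.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o",  # assumed vision-capable model name
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe what is shown in this image."},
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/photo.jpg"},  # placeholder
                },
            ],
        }
    ],
)

# The model replies with plain text, e.g. a description of the image's contents.
print(response.choices[0].message.content)
```

The same chat-style request format is used for text-only prompts; the image is simply one more content part in the user message.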

Currently, GPT-4 is only available to ChatGPT Plus subscribers. OpenAI says it plans to gradually expand GPT-4's availability and to explore its applications in other fields and scenarios.

However, unlike with previous models, OpenAI did not disclose technical details such as GPT-4's parameter count, hardware scale, or training time. This lack of transparency makes it harder to assess the model's real capabilities and potential risks. In addition, GPT-4 still has technical limitations and flaws: although OpenAI says it has conducted extensive safety testing and ethical review, GPT-4 may still generate biased, erroneous, or hateful text.

 


Reprinted from: blog.csdn.net/m0_72843152/article/details/129559176