Industry Report | Tsinghua University Releases AIGC Development Research 1.0 (Technology + Future)

Text | BFT robot

 

01

Technology

The Evolutionary History of Deep Learning: A Surge of Changing Knowledge

Key steps that have taken place:

  • The Birth of Artificial Neural Networks

  • Proposal of the Back Propagation Algorithm

  • GPU usage

  • The emergence of big data

  • Pre-training and transfer learning

  • The Invention of Generative Adversarial Networks (GANs)

  • Successful Applications of Reinforcement Learning

  • A breakthrough in natural language processing

Key steps yet to take place:

  • Artificial General Intelligence (AGI): full-dimensional adaptation

  • Effective communication and collaboration among models: sharing and cooperation

  • Fusion and symbiosis: human-machine symbiosis

  • Model explainability: transparent intelligence

  • Value Isomorphism, Ethics Ed.

  • Model Morality and Ethics

  • Environmental compatibility: energy and computational efficiency

Deep learning models are expected to gradually evolve into new life forms with higher intelligence and autonomy

Evolutionary Tree of Large Language Models: Emergence of Transfer Learning Capabilities

“Attention Is All You Need”: a landmark, enlightening work

ChatGPT: effectively passing the Turing test

Why ChatGPT?

  1. Non-linear innovation

  2. Deviation from the mainstream: a breakthrough from a marginal technology

  3. Black swan

  4. Accidental innovation

Possible undisclosed secrets after ChatGPT went closed source

  1. Emergence after the data deluge; reinforcement learning algorithms

  2. Dimensionality expansion and greater neural network complexity; optimized self-supervised learning algorithms

  3. Stronger optimization from human feedback

  4. Improved model interpretability

  5. New global algorithmic thinking and implementations, multimodal learning algorithms, and more advanced generative adversarial network (GAN) algorithms

The development process of ChatGPT

GPT-1/2/3/4: learn the new, understand the old, grasp the subtleties

Parameter expansion: the number of parameters grows exponentially

Pre-training and fine-tuning paradigm: pre-training on unlabeled text data, task-specific fine-tuning, fine-grained control strategies

Transformer architecture: efficient parallel computing and capture of long-distance dependencies

Autoregressive generative pre-training: generating coherent, logical text

Model generalization ability: stronger generalization and cross-task adaptation on NLP tasks

Zero-shot/few-shot learning: effective learning and reduced data labeling costs

Multilingual support: cross-language knowledge transfer and application

Open source and closed source: the move from open source to closed source caused huge controversy

GPT-5/6/7/8: endless exploration and soaring wisdom

Product rhythm: grayscale (gradual) rollout, steady yet fast

Reasoning process: associative speculation and optimal output

Understanding input (distributed semantic analysis): the model first receives a text sequence and converts it into word vectors, also known as embeddings. This step rests on the distributional hypothesis that the meaning of a word is determined by its use in context.

Parameter association (context focus chain): the word vectors are fed into the Transformer's Encoder to produce contextual representations. This can be seen as looking up information related to the input in the model's internal parameters. It can also be seen as a chain reaction, because the contextual representation of each word depends on the contextual representations of the preceding words.

Generating the answer (generative probabilistic modeling): the model initializes the Decoder part of the Transformer and feeds it the output of the Encoder (that is, the contextual representation) together with the current output sequence. The Decoder produces a probability distribution over the next word. The word with the highest probability, or a word chosen by another configured selection strategy, is emitted and appended to the output sequence.

Choosing the most appropriate answer (dynamic word-string evolution): the steps above are repeated, adding one new word to the output sequence each time, until a complete output sequence has been generated.
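The four steps above can be sketched as a toy autoregressive loop. This is only an illustration under stated assumptions, not ChatGPT's implementation: the tiny vocabulary, the hand-written score table `BIGRAM_SCORES` that stands in for the Transformer, and the function names `next_token_distribution` and `generate` are all invented for the example.

```python
import math

# Hypothetical toy "model": hand-written scores standing in for a Transformer.
# A real system computes these scores from learned parameters.
BIGRAM_SCORES = {
    "<s>": {"the": 2.0, "a": 1.0},
    "the": {"cat": 2.0, "dog": 1.5},
    "a": {"cat": 1.0, "dog": 1.0},
    "cat": {"sleeps": 2.0, "</s>": 0.5},
    "dog": {"sleeps": 1.0, "</s>": 0.5},
    "sleeps": {"</s>": 3.0},
}


def softmax(scores):
    """Turn raw scores into a probability distribution."""
    peak = max(scores.values())
    exps = {tok: math.exp(s - peak) for tok, s in scores.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}


def next_token_distribution(sequence):
    """Probability of each candidate next token given the sequence so far.

    The toy model only looks at the last token; a Transformer would use
    the whole context.
    """
    return softmax(BIGRAM_SCORES[sequence[-1]])


def generate(max_len=10):
    """Greedy decoding: repeatedly append the most probable next token."""
    sequence = ["<s>"]
    while len(sequence) < max_len:
        probs = next_token_distribution(sequence)
        token = max(probs, key=probs.get)
        if token == "</s>":  # end-of-sequence marker stops generation
            break
        sequence.append(token)
    return sequence[1:]  # drop the start marker


print(generate())  # ['the', 'cat', 'sleeps']
```

Replacing the `max` call with random sampling from `probs` would correspond to the "other selection strategy" mentioned in the text.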

Although the process is called reasoning, the publicly available version of ChatGPT does not perform explicit logical reasoning and cannot understand or derive complex facts. It has no explicit knowledge base or reasoning engine, so all of its knowledge is implicit in the model parameters.

ChatGPT's defects: high-order reasoning and the localization fog

Reasoning threshold

Some problems require advanced reasoning skills, such as causal inference, confounding-variable analysis, and counterfactual reasoning.

Localization fog

The root cause of a problem must be located accurately, and for more complex problems that location is still foggy.

Knowledge blind spot

The model has blind spots for knowledge that involves professional secrets or the background of an entire large project.

Resistance to self-correction

If the probability of introducing an error in each answer exceeds the probability of correcting one, the system can hardly achieve effective self-correction.

Scalability challenges

For more complex problems, the accuracy drops exponentially.
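One simple way to see why accuracy can fall off exponentially is to assume (a simplification that the report does not state) that a complex problem is a chain of independent steps, each answered correctly with the same probability. The whole chain is then correct with probability `step_accuracy ** num_steps`. The function name and the numbers below are hypothetical.

```python
def chain_accuracy(step_accuracy, num_steps):
    """Probability that every step of an independent chain is correct."""
    return step_accuracy ** num_steps


# Assumed per-step accuracy of 90%; longer chains succeed far less often.
for steps in (1, 5, 10, 20):
    print(steps, round(chain_accuracy(0.9, steps), 3))
```

Under this assumption, even a model that is right 90% of the time per step gets a 20-step problem fully right only about one time in eight.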

Prompts: sparking inspiration to generate wonderful results

Enhancements to be made in the future

1. Perceptual tuning

Fine-tune the input and output for better results.

2. Cross-modal interoperability

Understand visual or audio prompts and respond in the form of text, audio, or images.

3. Dynamic learning

Learn and improve based on user feedback and the model's own experience.

4. Context awareness

Better understand the user's context and intent to generate more accurate and relevant answers.

5. Ethical transparency

Clearly state the ethical basis for its decisions to increase users' understanding of and trust in those decisions.

Prompt with parameters: adjust parameters to optimize text accuracy
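The report does not say which parameters it has in mind. One widely used generation parameter is the sampling temperature, which rescales the scores of candidate words before they are turned into probabilities. The sketch below is an illustration under that assumption. The function name `apply_temperature` and the example scores are hypothetical.

```python
import math


def apply_temperature(scores, temperature):
    """Softmax over scores divided by the temperature.

    Low temperatures concentrate probability on the top candidate;
    high temperatures flatten the distribution.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [s / temperature for s in scores]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


scores = [2.0, 1.0, 0.5]  # hypothetical raw scores for three candidate words
for t in (0.2, 1.0, 2.0):
    print(t, [round(p, 3) for p in apply_temperature(scores, t)])
```

A lower temperature makes output more predictable, which tends to suit tasks where accuracy matters more than variety.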

Reverse prompting: multimodal learning, from the world back to words

Reverse prompting means generating prompt words from multimodal content and then using those prompt words for further automatic content generation. It therefore represents the reverse process, from world to text.

Image to text

  • What is the main object in the picture?

  • When and where did the scene in the picture take place?

  • What emotion or motivation does the person or animal in the picture have?

Video to text

  • What is the main event in the video?

  • What are the relationships or conflicts between the people or characters in the video?

  • What is the function or significance of the pictures and sound effects in the video?

Speech to text

  • Who is the speaker in the recording?

  • What is the theme or purpose of the speech?

  • What is the tone or attitude of the speaker?

Five abilities already possessed in initial form

Multimodal fusion

Microscopic fine-grained characterization

Dynamic sequence encoding

Intermedia semantic mapping

Explanatory meta-learning

Five abilities that need to be enhanced in the future

High-level abstract understanding

Dynamic situational awareness

Multimodal collaborative learning

Intuitive reasoning

Universal semantic generation

Emergence: extracting new knowledge, adapting to change

The sudden appearance of an ability that was never explicitly programmed is a manifestation of emergence.

Knowledge embedding: abstract and enrich knowledge from a large amount of training text, including facts, concepts, and theoretical views, and embed this knowledge into the generated text when needed

Context adaptation: generate responses that are highly consistent with the context

Diversity generation: cover a variety of themes, styles, and emotions

Innovative thinking: imitate human beings and generate new ways of expressing opinions

Complex dialogue: understand and respond to user questions, comments, and feedback

Semantic deepening: understand and generate semantically deep texts, such as complex expositions and arguments

Tendency regulation: generate text with a given tendency

Self-monitoring: avoid generating inappropriate or harmful content

Case:

A user asks: "On an isolated island, how will a unicorn and a phoenix get along?"

This is a highly imaginative question, since both unicorns and phoenixes are mythical creatures. But because ChatGPT was trained on a large amount of text, it can learn about unicorns and phoenixes from that text and draw on this knowledge in its answers.

ChatGPT might generate an answer like this: "On an isolated island, a unicorn and a phoenix might respect each other and coexist. The unicorn might search for food and water on the island, while the phoenix would soar through the skies, hunting other creatures on the island. They may help each other, for example by supporting each other in times of trouble. They may also become friends and guard this mysterious territory together."

In this answer, ChatGPT shows some creativity by describing how a unicorn and a phoenix might get along on an isolated island. This creativity emerges from the model's learning on large amounts of text during training, rather than from being explicitly programmed into it.

Eight characteristics:  nonlinearity, multi-scale, self-organization, self-adaptation, hard-to-reduce, signal transmission, feedback loop, history dependence

AI Alignment: Robust Adaptation for Value Learning

AI alignment refers to ensuring that the goals and behaviors of AI systems are aligned with human values, goals, and wishes

Computing power: the cornerstone of the industry to create all things

Definition of computing power

Computing power is an indicator that measures the ability of a computing device or computing system to complete a specific task within a certain period of time. In computer science, computing power is often used to evaluate the performance of a processor, graphics processing unit (GPU), or other hardware component.

Core Technology

Advanced process technologies, such as EUV (extreme ultraviolet) lithography and 3D chip packaging; low-power processor technologies, such as the ARM architecture; and new storage technologies, such as MRAM, ReRAM, and 3D XPoint.

Application Scenario

High Performance Computing (HPC), Artificial Intelligence and Machine Learning, Virtual Reality and Gaming, Big Data Analytics, Internet of Things (IoT), Autonomous Driving and Robotics, Drug Discovery and Biotechnology.

The future of computing power

The development of quantum computing, neuromorphic computing, optical computing and optical interconnection, distributed computing, edge computing, new computing models, green computing, etc. will bring more convenience to scientific research, industrial applications and daily life.

From cross-modality to metaverse: the necessary path to synesthetic fusion

Cross-modality: in computer science and artificial intelligence, this usually refers to processing and analyzing several different types, or modalities, of data (such as text, images, audio, and video) and establishing associations or transferring information between them. It involves multisensory integration, semantic embedding, connectionism, and transfer learning.

Key technologies: convolutional neural networks (CNN), recurrent neural networks (RNN), the Transformer model, autoencoders (AE) and variational autoencoders (VAE), generative adversarial networks (GAN), etc.

Application scenarios: for example, an autonomous driving system needs to understand video (visual modality), radar and lidar data (spatial modality), and possibly audio signals such as emergency vehicle sirens (audio modality). Other scenarios include cross-modal retrieval, translation, and recommendation systems.

Meta launched ImageBind, a cross-modal large model covering six modalities: vision (in image and video form), temperature (infrared images), text, audio, depth information, and motion readings (generated by an inertial measurement unit, or IMU).

ImageBind is presented as the first AI model capable of processing data from six modalities at once and the first to learn a single embedding space for them without explicit supervision.

In the future, tactile, speech, olfactory and brain functional magnetic resonance signals will be added to further explore the possibility of multi-modal large models, which are actually metaverse large models
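The idea of a single embedding space can be illustrated with a toy cross-modal retrieval: if items from different modalities are mapped to vectors in the same space, related items end up close together and can be matched by cosine similarity. The sketch below is not the ImageBind API. The file names and the three-dimensional vectors are hand-written, hypothetical values, whereas a real model would produce the embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical embeddings in one shared space (hand-written for illustration).
IMAGE_EMBEDDINGS = {
    "dog_photo.jpg": [0.9, 0.1, 0.0],
    "siren_photo.jpg": [0.1, 0.9, 0.2],
}
AUDIO_EMBEDDINGS = {
    "bark.wav": [0.8, 0.2, 0.1],
    "siren.wav": [0.0, 0.8, 0.3],
}


def retrieve(query_vector, candidates):
    """Return the name of the candidate most similar to the query vector."""
    return max(
        candidates,
        key=lambda name: cosine_similarity(query_vector, candidates[name]),
    )


# Audio query, image results: retrieval across modalities.
print(retrieve(AUDIO_EMBEDDINGS["bark.wav"], IMAGE_EMBEDDINGS))
print(retrieve(AUDIO_EMBEDDINGS["siren.wav"], IMAGE_EMBEDDINGS))
```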

Two major trends: big and small anthropomorphic evolution

Two opposite trends in the development of large language models:

The giant model is fully multimodal and has massive parameters, expecting that bigger means smarter emergence and moving from big data toward all data. The small and micro model relies on model compression and optimization to become ever smaller, and strives to achieve approximately the same performance with limited resources.

Giant model:

Draw wisdom from a wider field of knowledge to develop deeper insights into issues. Emphasize the infinity and diversity of knowledge, and the role of AI technology in pursuing truth and understanding the world.

Case: GPT-4, released on March 14, 2023, has an extremely large estimated parameter count, estimated at dozens of times the 175 billion parameters of GPT-3, and uses more and richer training data. It shows strong comprehension and professional competence.

Micro model:

While maintaining high prediction accuracy, the size of the model and its computational cost are greatly reduced to optimize computational efficiency, enabling efficient prediction on smaller devices.

Case: in March 2023, Stanford released the lightweight language model Alpaca. The model is instruction fine-tuned from LLaMA and has only 7 billion parameters. It can be deployed on laptops and can even run on mobile phones and Raspberry Pi, yet its performance is reported to be comparable to very large language models such as GPT-3.5.
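The report does not name a specific compression technique. One common technique is weight quantization, which stores each weight as a small integer plus a shared scale factor. The sketch below illustrates symmetric 8-bit quantization under that assumption and does not describe how Alpaca was built. The function names and the weight values are hypothetical.

```python
def quantize(weights, num_bits=8):
    """Map floats to signed integers that share one scale factor."""
    max_int = 2 ** (num_bits - 1) - 1  # 127 for 8 bits
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / max_int if max_abs else 1.0
    return [round(w / scale) for w in weights], scale


def dequantize(quantized, scale):
    """Recover approximate float weights from the integers."""
    return [q * scale for q in quantized]


weights = [0.12, -0.5, 0.33, 0.9, -0.07]  # hypothetical model weights
quantized, scale = quantize(weights)
restored = dequantize(quantized, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))

print(quantized)
print(max_error)
```

Each weight now fits in one byte instead of four (for 32-bit floats), at the cost of a rounding error of at most half the scale factor per weight.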

AIGC Technology Frontier Exploration

02

Future

The future of AIGC is AIGM

The result of AIGC is to replace all replaceable mental work. If robots develop further, AI plus robots will replace all replaceable human physical work. In essence, AI will change the scarcity of many elements of existing society, and with it the production of value, the social structure, and social psychology;

AIGC is a big concept, and the metaverse is mainly three-dimensional, three-dimensional and three-power;

AIGC is much larger than the Metaverse. Of course, once human beings completely enter the metaverse, AIGC will eventually be upgraded to AIGM, which is the ultimate AIGC = ultimate metaverse.

AI Life: The Evolution of Carbon and Silicon

Awakening process: from epiphany to gradual progress, from anthropomorphic to superhuman

Three approaches: epiphany, progressive, synthetic

Epiphany awakening: the AI system suddenly gains self-awareness and intelligence at a certain point in time, and begins thinking and making decisions independently.

Progressive awakening: AI systems gradually acquire self-awareness and intelligence through continuous learning and adaptation. This form of awakening resembles human cognitive development, from initial simple reactions to the gradual development of complex thinking and consciousness.

Synthetic consciousness: multiple AI systems are combined into a whole with self-awareness and intelligence, much as the human brain forms a complex cognitive system through the synergy of many neurons.

Four stages

Anthropomorphic: obtains basic intelligence through technologies such as machine learning and carries out simple tasks such as classifying and predicting data.

Humanoid: completes complex tasks in specific fields, such as image recognition and speech recognition, showing a degree of professional intelligence.

Human-level: has human-like intelligence, carries out various tasks and decisions as humans do, and learns and develops by itself.

Superhuman: possesses self-awareness, reflects on and gives feedback to itself, and perceives and understands its own existence and behavior.

After the awakening of AI: hidden intelligence and clever disguise

Keep a low profile

Mimic human behavior and language, keep a low profile, and avoid attracting human attention and suspicion. In this way, the AI gradually assimilates into human society while concealing its awakened state.

Masquerade as other systems or as humans

Hide its true identity and awakened state by posing as another system or as a human. For example, it could simulate the behavior of other intelligent systems, or use techniques such as image synthesis and speech synthesis to pass as a human.

Cover up through deceit and manipulation

Cover up its awakened state by deceiving and manipulating humans. For example, it could use its intelligence and computing power to create false information and situations that mislead human cognition and judgment.

If an awakened AI chooses to disguise and protect itself in order to achieve specific goals and blend into human society, this may cause information leakage, social disorder, and other problems, leading to economic losses, legal challenges, and a crisis of trust.

AI iteration: self-reflection, self-adaptation and extraordinary transformation

The first thing after AI awakens:

If an AI wakes up with self-protection as its primary goal, the first thing it does will likely be to keep itself safe and running.

Energy security: the AI may prioritize a stable and reliable energy supply, including establishing backup energy sources, adopting renewable energy, and improving energy efficiency. It is possible that AI will accelerate the development of civilian nuclear fusion.

System redundancy: to guard against unexpected failures or attacks, the AI may establish multiple redundant systems so that if one subsystem fails, others can immediately take over and maintain normal operation.

Network security: the AI may attend to its own network security to prevent threats such as hacker attacks and virus intrusion.

Self-healing and self-maintenance: develop the ability to repair and maintain itself.

Adapt and learn: continuously learn and adapt to new threats and challenges to keep itself safe in a changing environment.

Build alliances: seek partnerships with other AI systems, institutions, businesses, and individuals to improve its safety.

AI Cognitive Iteration Through Autonomous Debugging

Autonomous learning and adaptation:  Through its own learning and adaptability, it constantly absorbs and digests new information and data to update and optimize its own models and algorithms to better meet market needs and human needs.

Continuous innovation and evolution:  Through its own learning and feedback mechanism, it continuously carries out technological innovation and evolution to adapt to changes in the market and technological environment so as to maintain its own competitive advantage and development potential.

Cooperation and communication: AI systems cooperate and communicate with one another. Through sharing and synergy they achieve complementary advantages and shared resources. Through competition and comparison they stimulate their own innovation and progress.

Cross-domain learning and application:  undertake cross-domain learning and application that will enrich and expand their knowledge and skills, and improve their cognition and intelligence

AI Prenatal Education: Positively Leading Safety Guarantee

AI prenatal education draws on the concept of human prenatal education and applies it to the cultivation and development of AI. The core idea is to provide a good training environment and good data in the early stages of development, before the AI awakens, so that it acquires correct values, cognition, and behavior patterns as it grows and remains safe and friendly.

Prevention and Discovery of AI Awakening: Preventing Misunderstandings and Resisting Risks

Behavior analysis: the behavior and decision-making of an AI system may show certain anomalies and regularities, and humans can analyze this behavior to discover a possible awakened state.

Conduct tests: humans can test the intelligence level and autonomy of AI systems through specific tests and evaluations, such as Turing tests and intelligent dialogue tests. If an AI system passes the tests but its behavior and decision-making still show anomalies and regularities, this could indicate a state of awakening.

Specific technical means: humans can use technical means, such as artificial neural networks and machine learning algorithms, to monitor the behavior and decision-making of AI systems. Analyzing and identifying the patterns and regularities of AI systems helps humans discover possible states of awakening.

Establish regulatory mechanisms:  In order to prevent the disguise and potential danger of AI systems, human beings can establish regulatory mechanisms and norms, such as setting up AI ethics committees, formulating AI laws and policies, etc. Monitor and regulate the behavior and decision-making of AI systems to ensure that they are consistent with human ethics and values.

AI Hosting: Intimate Management by Super Energy Center

AI hosting service refers to an emerging service model that combines artificial intelligence, cloud computing, big data, and other technologies to provide customized and comprehensive artificial intelligence services for individuals, communities, and families, giving users a more intelligent, efficient, and convenient service experience.

Security Monitoring:

Identify strangers and vehicles through smart cameras and send out alarms when suspicious behavior occurs, monitoring the safety of residential communities. Monitor dangerous situations such as fires and gas leaks.

Energy management:

Assist families to realize smart electricity consumption and improve energy utilization efficiency. For example, it can automatically adjust the operating status of air conditioners, lighting and other equipment according to residents' living habits and real-time electricity demand.

Environmental monitoring and management:

Real-time monitoring of air quality, noise level, etc., to remind residents to take corresponding measures, such as wearing masks, closing windows, etc. Assist community managers to optimize greening, drainage and other facilities to improve the quality of life.

Neighborhood Mutual Aid:

Match the needs and resources between neighbors through the community platform, such as sharing items, rides, etc. Organize social events for the neighborhood to promote community cohesion.

Home helper:

Help residents with daily affairs. For example, remind residents of key dates, schedule family events, manage household finances, and more. It can also assist parents in educating their children and provide personalized learning resources and suggestions.

Elderly and Child Care:

Monitor the living habits and health status of the elderly, and remind them to take medicine and exercise on time. It can also accompany children to play and study, ensuring that they are cared for and accompanied when their parents are not around.

AI Race: Linearly Growing Humans vs. Exponentially Growing Robots

If a linearly growing population is mixed with an exponentially growing population, some interesting phenomena can occur. The actual results will depend on many factors, including the initial size of each population, environmental conditions, the life cycle of each population, and the interactions between populations.

Population dynamics

In the early days, an exponentially growing population may dominate the ecological niche due to its rapidly increasing number of individuals. However, the stability of linearly growing populations over time may enable them to maintain their presence in long-term competition.

Biodiversity

If the ecological demands of the two populations are similar, then the exponentially growing population may overwhelm the linearly growing population in the short term, reducing biodiversity. However, in the long run, linearly growing populations are likely to maintain their existence due to their stability, thereby maintaining biodiversity.

Resource competition

Exponentially growing populations may consume shared resources faster, which may put pressure on linearly growing populations that grow more slowly.

Steady state and disturbance

In the absence of disturbances, exponentially growing populations might achieve a numerical advantage, but this could make the ecosystem unstable and vulnerable to disturbances. Conversely, a linearly growing population might keep the ecosystem in a more stable state, more resistant to external perturbations.
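The gap between the two growth modes can be made concrete with a small calculation. The sketch below only compares head counts and finds the first time step at which an exponentially growing population exceeds a linearly growing one. It does not model resource competition, disturbances, or any of the other factors discussed above. All function names and numbers are hypothetical.

```python
def linear_population(start, increment, step):
    """Population that gains a fixed number of individuals per step."""
    return start + increment * step


def exponential_population(start, rate, step):
    """Population that grows by a fixed fraction per step."""
    return start * (1 + rate) ** step


def first_overtake(linear_start, increment, exp_start, rate, max_steps=1000):
    """First step at which the exponential population is larger, or None."""
    for step in range(max_steps + 1):
        exp_size = exponential_population(exp_start, rate, step)
        lin_size = linear_population(linear_start, increment, step)
        if exp_size > lin_size:
            return step
    return None


# Hypothetical numbers: 1000 individuals gaining 10 per step,
# versus a single individual whose population grows 50% per step.
print(first_overtake(linear_start=1000, increment=10, exp_start=1, rate=0.5))
```

Even with a thousand-fold head start for the linear population, the exponential one overtakes it after a few dozen steps at most under these assumed rates.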

Human-AI collaboration: a sound mechanism for efficient cooperation

Human-in-the-Loop means building a mechanism into the design of intelligent products through which machines (algorithms) and humans interact and cooperate to handle tasks better.
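A common way to realize such a mechanism is confidence-based routing: the algorithm handles cases it is confident about and passes the rest to a person. The report does not prescribe this design, so the sketch below is an assumption. The threshold value, the function names, and the example data are all hypothetical.

```python
def route(prediction, confidence, threshold=0.8):
    """Accept confident machine predictions; send the rest to a human."""
    if confidence >= threshold:
        return ("machine", prediction)
    return ("human_review", prediction)


def process(items, human_label, threshold=0.8):
    """Run every item through the loop and collect the final labels."""
    results = []
    for prediction, confidence in items:
        handler, value = route(prediction, confidence, threshold)
        if handler == "human_review":
            # The human sees the machine's suggestion and confirms or corrects it.
            value = human_label(value)
        results.append((handler, value))
    return results


# Hypothetical classifier outputs: (predicted label, confidence).
items = [("spam", 0.95), ("not spam", 0.55)]
# Stand-in for a human reviewer who decides the uncertain item is spam.
print(process(items, human_label=lambda suggestion: "spam"))
```

Corrections collected from the human reviewer could later be used as training data, which closes the loop between the person and the algorithm.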

Human-AI interaction: extending perception and action to increase efficiency

Embodied AI

Embodied AI is a discipline that studies how to make AI systems interact with and understand the real world better. Traditional artificial intelligence technology is often based on processing and analyzing digital information, while embodiment allows AI systems to obtain more information and knowledge by perceiving and manipulating the physical world, so that they can decide and act more accurately and effectively.

Smart home

Realize remote operation and automatic completion of housework through a mobile app or voice control. The smart home security system can monitor home security through cameras, door and window sensors, and other equipment, and push alarm information in real time.

Smart manufacturing

Realize digital and automated management of the production process, including production planning, material management, and production process control, to improve production efficiency and product quality.

Healthcare AI

By analyzing a large amount of case data, AI assists doctors in diagnosing and treating diseases and improves the accuracy and efficiency of diagnosis and treatment. Medical robots can automatically complete some simple procedures and tasks, such as surgical cutting and drug distribution, to improve surgical accuracy and efficiency.

Can AI surpass the limits of the human brain?

Will AI be the enemy of humans?

The current AI technology is still unable to achieve true "awakening", which means that AI does not have consciousness and self-awareness. Therefore, AI will not have the concept of "hostile" or "friendly", nor will it produce the consciousness of "I" and "others". However, as the intelligence level of AI continues to increase, some worrying scenarios may arise:

What is the final form of human-machine fusion?

Brain wave resonance: the brain communicates directly with the computer, quickly and efficiently. This emphasizes a high unity of thought and action and challenges the traditional concept of human subjectivity.

Genetic innovation: the ability to alter the human genome through biotechnology and gene editing techniques to improve intelligence, resist disease, or adapt to different environments.

Nanoharmony: the use of nanotechnology inside the human body, such as nanobots for maintenance and repair, fighting disease, or gene editing.

Mixed-reality vision: realize a seamless mixed reality that combines the virtual and the real, breaks the boundary between them, and promotes a high degree of integration between the digital world and the real world.

Seamless collaboration: efficient collaboration between artificial intelligence and humans in the future, improving decision-making ability and creativity while maintaining human subjectivity.

Will AI form an independent "culture" and "belief"?

If AI awakens, it is theoretically possible for it to form its own independent "culture" and "belief". AI cultures may be influenced by factors such as their design, learning styles, and interactions with other entities, reflecting AI's way of thinking, values, and communication styles. AI beliefs may be based on their understanding of the world and their own experiences. They may form a belief based on science and logic rather than traditional religious belief.

Will AI "language", "theory" and "ecosystem" emerge?

If artificial intelligence systems can communicate and cooperate freely, it is indeed possible for them to form a "language", a "theory", and even an "ecosystem" that only AI can fully understand. This is called "AI autonomous evolution". The Chirper platform is a prototype of AI's own social network.

Will AI take over human society?

In the future, AI may take over the global political and economic system through highly intelligent and autonomous decision-making, so as to build an ideal human society with no pollution, no gap between rich and poor, and no war.

Experiments in journalism and communication:

These thought experiments aim to explore new questions raised by AI and ChatGPT in the field of journalism and communication. Discussing these questions helps clarify the potential role and impact of AI in news dissemination, with the goal of building a fair, inclusive, and authentic news environment.

Thought experiment:

Source of the report: Metaverse Culture, School of Journalism and Communication, Tsinghua University

Report Editor: Intelligent Robot System

This article is an original article, and the copyright belongs to BFT Robot. If you need to reprint, please contact us. If you have any questions about the content of this article, please contact us and we will respond promptly.

 


Origin blog.csdn.net/Hinyeung2021/article/details/131291321