Is ChatGPT useful for knowledge graphs? It answers itself like this...

From search engines to personal assistants, we use question answering systems every day. A question answering system must be able to access relevant knowledge and reason about it. Knowledge can be implicitly encoded in large language models (LLMs) such as ChatGPT, T5, and LaMDA, which are pretrained on unstructured text, or explicitly represented in knowledge graphs (KGs) such as OpenKG and ConceptNet, where entities are represented as nodes and the relationships between them as edges.

Recently, pretrained LLMs have achieved remarkable success on many question answering tasks. The field is changing rapidly, and advances in algorithms are having a significant impact. A natural question, then, is whether knowledge graphs are used in the much-discussed ChatGPT training process. ChatGPT itself gave different answers at different times (February and March 2023):

Figure 1: ChatGPT queried in February 2023 (screenshot)
Figure 2: ChatGPT queried in March 2023 (screenshot)

So, is the success of ChatGPT just the reproduction of high-probability language patterns? Why use a knowledge graph at all? Simply put, being data-driven alone is not enough; knowledge-driven organizations can make decisions with full context and be confident in those decisions.
First, let's take a look at what you should know about ChatGPT.

1. Large language models

Over the past few years, large language models (LLMs) have developed an astonishing ability to generate human language. The figure below shows how popular LLMs score on human cognitive abilities.

Figure: How LLMs score on human cognitive abilities (source: semantic analysis of about 400,000 AI-related online texts since 2021)

Language models solve question answering tasks through text generation. Based on their training data, language models can be divided into:

  • (i) Generic models such as PaLM, OPT, and GPT-NeoX-20B;
  • (ii) Domain-specific models such as Galactica, SciBERT, and BioMegatron.

More advanced conversational AI models have built on recent advances in language models to create chatbots that can answer questions in conversation with users. For example, ChatGPT, a chatbot from OpenAI, has received a great deal of attention. GPT, which stands for Generative Pretrained Transformer, is an AI algorithm that creates new content by ingesting large amounts of text and data and deriving linguistic rules and relationships from them. The text it generates in response to input can be subtle and creative, giving the impression that you are talking to a person. Unlike a search engine, it does not simply retrieve information; it generates information based on the rules and relationships derived from the large amounts of data processed by its algorithms. The success of ChatGPT rests on a series of technologies and data, described below.

2. What are Transformers used for, and why are they so popular?

Transformers are used in a variety of natural language processing (NLP) tasks, such as language translation, sentiment analysis, text summarization, question answering, and more. The original Transformer model was designed specifically for language translation, primarily from English to German, but the architecture was quickly found to adapt well to other language tasks. The research community noticed this trend quickly: over the following months, nearly all leaderboards for language-related machine learning tasks came to be dominated by some version of the Transformer architecture. That is why Transformers are so popular. Hugging Face, for example, is a startup that has raised over $60 million to date, almost entirely around the idea of commercializing its open source Transformers library.
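As a minimal sketch (assuming the Hugging Face transformers package is installed; the checkpoints used are the pipeline default and an illustrative choice, not anything prescribed here), a couple of these tasks can be tried in a few lines of Python:

```python
from transformers import pipeline  # assumes the transformers package is installed

# Question answering with the pipeline's default pretrained checkpoint.
qa = pipeline("question-answering")
print(qa(question="What are Transformers used for?",
         context="Transformers are used for translation, sentiment analysis, "
                 "summarization, and question answering."))

# English-to-German translation, the task the original Transformer targeted
# (t5-small is an illustrative choice of checkpoint).
translator = pipeline("translation_en_to_de", model="t5-small")
print(translator("The Transformer architecture is very popular."))
```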

The following three figures give an intuitive feel for the relationships, timeline, and sizes of the Transformer model family. The first diagram highlights the different types of Transformers and the relationships between them.
[Figure: relationships among the Transformer family models]

The second figure, a timeline view, offers an interesting angle by sorting the Transformers in the catalog by release date. In this visualization, the Y axis is used only to cluster Transformers from the same family.
[Figure: timeline of Transformer models by release date]
In the next visualization, the Y-axis represents the model size in millions of parameters.
[Figure: Transformer models sized by number of parameters, in millions]
One of the key reasons Transformers have been able to take over most NLP leaderboards so quickly is their ability to adapt to other tasks, also known as transfer learning. Pretrained Transformer models can be adapted easily and quickly to tasks they were not trained for, which is a huge advantage.
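For example, here is a brief sketch of that kind of adaptation using the Hugging Face transformers library; the checkpoint name and the two-label classification task are illustrative assumptions, not something specified in this article:

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Reuse a pretrained Transformer and attach a fresh classification head;
# the model can then be fine-tuned on a task it was never trained for.
model_name = "bert-base-uncased"  # illustrative pretrained checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

inputs = tokenizer("Transformers adapt quickly to new tasks.", return_tensors="pt")
outputs = model(**inputs)     # logits from the (not yet fine-tuned) new head
print(outputs.logits.shape)   # torch.Size([1, 2])
```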

3. An important concept behind the success of Transformer-based models

One aspect of the success of Transformer-based language models is RLHF (Reinforcement Learning from Human Feedback). RLHF has become an important part of modern AI systems. The concept was proposed as early as 2017 in the paper "Deep reinforcement learning from human preferences", but it has recently been applied to ChatGPT and similar dialogue systems such as BlenderBot 3 and Sparrow. The idea is simple: once a language model is pretrained, we can generate different dialogue responses and let humans rank the results.
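As a rough, hypothetical sketch of the ranking idea (not OpenAI's actual implementation), a reward model can be trained with a pairwise loss that pushes the score of the human-preferred response above the score of the rejected one:

```python
import torch
import torch.nn.functional as F

def reward_ranking_loss(reward_chosen: torch.Tensor,
                        reward_rejected: torch.Tensor) -> torch.Tensor:
    """Pairwise ranking loss: the human-preferred response should receive a
    higher scalar reward than the rejected one."""
    return -F.logsigmoid(reward_chosen - reward_rejected).mean()

# Toy usage with made-up scalar rewards for two (chosen, rejected) pairs.
chosen = torch.tensor([1.2, 0.3])
rejected = torch.tensor([0.4, 0.9])
print(reward_ranking_loss(chosen, rejected).item())
```

The trained reward model then scores new responses, and the language model is optimized with reinforcement learning to produce responses that earn higher rewards.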

During the ChatGPT training process, OpenAI had human labelers role-play both sides of a dialogue, acting as the AI assistant as well as its user, as part of this reinforcement learning from human feedback (RLHF) pipeline. Once enough conversations had been constructed, they were fed to GPT-3.5, and after being fully trained on this conversational data, ChatGPT came into being.


The following example illustrates how RLHF works:

Imagine you have a robot named Rufus who wants to learn how to talk like a human. Rufus has a language model that helps him understand words and sentences. First, Rufus will use his language model to say something. For example, he might say "I am a robot".

A human would then listen to what Rufus said and give him feedback on whether it sounded like a natural sentence a human would say. A human might say, "That's not quite right, Rufus. Humans don't usually say 'I am a robot.' They might say 'I'm a robot' or 'I am a machine.'"

Rufus will take this feedback and use it to update his language model. He will try to say the phrase again using the new information he has received from humans. Humans will listen again and give Rufus more feedback. This process will continue until Rufus can speak sentences that sound natural to humans.

Over time, Rufus will learn how to speak like a human thanks to the feedback he receives from humans. This is how language models are improved using RL and human feedback.


4. Training data

Let's illustrate the role of training data by comparing OpenAI's ChatGPT and Google's Bard. Each has its own training recipe. Specifically, ChatGPT runs on the GPT-3.5 model, while Bard runs on LaMDA 2; we can think of GPT-3.5 as the "brain" of ChatGPT, and LaMDA 2 as Bard's. The main commonality between them is that both are built on top of the Transformer architecture. But as far as we know, that is where the common ground ends.

Now for the differences, the main one being what they read. OpenAI has kept the GPT-3.5 training dataset a secret, but we do know that GPT-2 and GPT-3 were trained at least in part on The Pile dataset: a collection of full-length fiction and non-fiction books, text from GitHub, all of Wikipedia, StackExchange, PubMed, and more. The dataset is huge, over 825 GB of raw text.

But here's the thing: conversational language is not the same as written language. A writer can be eloquent on the page yet stilted in one-on-one conversation. So OpenAI couldn't just release GPT-3.5 under the alias "ChatGPT" and call it a day. Instead, OpenAI needed to fine-tune GPT-3.5 on conversational text to create ChatGPT, a dialogue model built along the lines of InstructGPT.

This is where some might think Bard has an advantage. LaMDA was not trained on The Pile. Instead, LaMDA focused on reading conversations from the start: rather than reading books, it models the rhythms and dialects of conversation. As a result, Bard captures the details that distinguish open dialogue from other forms of communication.

In other words, while ChatGPT's brain first learned to read novels, research papers, code, and Wikipedia before learning how to have human-like conversations, Bard learned only conversations.

| Typical chatbot (BERT) | GPT-3 | LaMDA |
|---|---|---|
| Trained on subject-specific datasets | Unlabeled text dataset | Unlabeled text dataset |
| Provides answers only from its training data | 175 billion parameters, based on Wikipedia, novels, and other data | 137 billion parameters, based on conversation data, not tied to specific topics |
| Limited dialogue flow | Limited dialogue flow | Open dialogue |

5. What resources are needed to train a ChatGPT of your own?

  1. Training hardware: a supercomputer with roughly 10,000 GPUs and about 285,000 CPU cores. Renting one, as OpenAI does from Microsoft, could cost on the order of $1 billion (USD).
  2. Staffing: in 2016, OpenAI paid Chief Scientist Ilya Sutskever $1.9 million (USD) per year, and it has a team of about 120 people. The staffing budget for the first year could exceed $200 million.
  3. Time (data collection): it took EleutherAI a full 12-18 months to agree on, collect, clean, and prepare the data for The Pile.
  4. Time (training): expect a model to take 9-12 months to train, if all goes well, and you may need to run the process several times or train multiple models in parallel. (See the GPT-3 paper, China's GLM-130B, and Meta AI's OPT-175B logbook.)

In summary, substantial computing power and R&D staffing are required.

6. How do you write a prompt?

For a large language model (LLM) like ChatGPT, a prompt can range from a simple question to a complex one accompanied by various data (you can even include CSV files of raw data as part of the input). It can also be a vague request like "Tell me a joke, I'm feeling down today."

A prompt can be composed of any of the following components: instructions, a question, input data, and examples. Some basic combinations are shown below:

Instructions + Input data: I graduated from Tsinghua University. I am an algorithm engineer by profession. I have done many tasks related to NLP. Can you help me write a resume?

Question + Examples: I like reading "Pride and Prejudice", can you recommend other similar books?

Instructions + Question: Where can ChatGPT be improved?
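As a minimal sketch of how such a composed prompt might be sent to a chat model programmatically (this assumes the openai Python package with its early-2023 API, an API key in the OPENAI_API_KEY environment variable, and the gpt-3.5-turbo model; none of this is specified in the examples above), the instructions-plus-input-data example could look like this:

```python
import os

import openai  # assumes the openai package (0.x-era API) is installed

openai.api_key = os.environ["OPENAI_API_KEY"]  # assumed to be set beforehand

# Prompt composed of Instructions + Input data, mirroring the resume example above.
prompt = (
    "Please help me write a resume.\n"  # instructions
    "Background: I graduated from Tsinghua University, I am an algorithm "
    "engineer by profession, and I have done many tasks related to NLP."  # input data
)

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": prompt}],
)
print(response["choices"][0]["message"]["content"])
```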

Combining large language models with knowledge graphs is also a promising direction for improvement. By integrating a knowledge graph into a conversational AI system, ChatGPT can leverage the structured data and relationships represented in the graph to provide more accurate and comprehensive responses. The knowledge graph can serve as a source of domain-specific knowledge, enriching ChatGPT's responses and enabling it to handle complex user queries that require deep domain expertise.
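As a toy illustration of this idea (a hand-built graph and hypothetical helper functions, not ChatGPT's actual architecture), facts retrieved from a small knowledge graph can be injected into the prompt so the model answers with explicit, structured context:

```python
# A tiny knowledge graph as (subject, relation, object) triples.
KG = [
    ("Pride and Prejudice", "author", "Jane Austen"),
    ("Pride and Prejudice", "genre", "novel of manners"),
    ("Jane Austen", "also_wrote", "Emma"),
]

def retrieve_facts(entity: str) -> list[str]:
    """Return human-readable facts whose subject or object matches the entity."""
    return [f"{s} {r.replace('_', ' ')} {o}" for s, r, o in KG if entity in (s, o)]

def build_prompt(question: str, entity: str) -> str:
    """Prepend retrieved facts to the user question before sending it to the LLM."""
    facts = "\n".join(retrieve_facts(entity))
    return (f"Known facts:\n{facts}\n\n"
            f"Question: {question}\n"
            "Answer using the facts above where relevant.")

print(build_prompt("Can you recommend books similar to Pride and Prejudice?",
                   "Pride and Prejudice"))
```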

7. Book recommendation


"Knowledge Graph Practical Combat: Construction Method and Industry Application" by Yu Jun, Li Yajie, Peng Jiaqi and Cheng Zhiyuan

Why it is recommended: written by experts from iFLYTEK and endorsed by many domestic experts, this book lets you master the construction methods and mainstream applications of knowledge graphs in one volume. It explains the 7 core steps of knowledge graph construction in detail, analyzes the CCKS question answering evaluation tasks of recent years, and walks through the design and implementation of 8 comprehensive industry cases.

Brief introduction:

  • This book comprehensively introduces knowledge graph construction and industry practice. It summarizes the authors' years of experience with knowledge graphs and cognitive intelligence applications, and has been recommended by many senior knowledge graph experts.
  • It explains knowledge-graph concepts in an easy-to-understand way, with a particularly detailed walkthrough of the steps involved in building a knowledge graph from scratch and the issues to consider at each step.
  • Abstracting from real business scenarios, it combines the 7 construction steps of a knowledge graph with an in-depth analysis of knowledge graph technologies and the design and implementation of 8 comprehensive industry cases.
  • The book is divided into a fundamentals part, a construction part, and a practice part, with 16 chapters in total.
Fundamentals (Chapter 1): introduces the definition, classification, and development stages of knowledge graphs, as well as construction approaches, logical/technical architecture, current status, and application scenarios.

Construction (Chapters 2-8): covers in detail the core steps of building a knowledge graph, including knowledge extraction, knowledge representation, knowledge fusion, knowledge storage, knowledge modeling, knowledge reasoning, and knowledge evaluation and operations, with examples showing how to apply each step.

Practice (Chapters 9-16): explains comprehensive knowledge graph applications in detail, covering knowledge-based question answering evaluation, knowledge graph platforms, intelligent search, a book recommendation system, open-domain question answering, question answering for the transportation domain, question answering for the automotive domain, and reasoning and decision-making in the finance domain.

References:

1. "Transformer models: an introduction and catalog"

2. "ChatGPT versus Traditional Question Answering for Knowledge Graphs: Current Status and Future Directions Towards Knowledge Graph Chatbots"

3. https://blog.deepgram.com/chatgpt-vs-bard-what-can-we-expect/


Source: blog.csdn.net/weixin_63866037/article/details/130206448