New work from Tsinghua's Tang Jie team: WebGLM, a 10-billion-parameter model that can access the Internet

WebGLM is a web-enhanced question-answering chatbot built on a large language model (LLM). It was developed by Professor Tang Jie's team in the Department of Computer Science at Tsinghua University, and the accompanying paper was accepted at KDD 2023.

WebGLM's distinguishing feature is that it uses web search and retrieval to augment the capability and generalization of an LLM, yielding an efficient, reliable, and versatile web-enhanced question-answering chatbot. It consists of three main components: an LLM-augmented retriever, a bootstrapped generator, and a human-preference-based scorer.

WebGLM achieves significant improvements on several publicly available question-answering and chat datasets, including clear gains over OpenAI's WebGPT, demonstrating its effectiveness and strength.

WebGLM is a notable research achievement that shows how to leverage the Web as a vast knowledge base, supplying the LLM with rich and diverse information sources and thereby improving its performance and generalization on question-answering and chat tasks. It also points to a new direction for future LLM research.

What is WebGLM?

WebGLM is a web-enhanced Q&A chatbot based on a large language model (LLM). Its goal is to augment a pre-trained LLM with web search and retrieval while remaining efficient enough for real-world deployment.

The core idea of WebGLM is to use the Web as a vast knowledge base that supplies the LLM with rich and diverse information sources, thereby improving its performance and generalization on question-answering and chat tasks.

WebGLM mainly consists of three components: an LLM-augmented retriever, a bootstrapped generator, and a human-preference-based scorer. Each is introduced below.

LLM-Augmented Retriever

The LLM-augmented retriever improves retrieval of relevant web content: given a query, it finds relevant references so that the subsequent answer can be accurate and well grounded.

It has two stages: coarse-grained web search and fine-grained LLM-enhanced dense retrieval.

Coarse-grained web search uses a traditional web search engine, such as Bing, to obtain a list of web pages relevant to the query. This step quickly narrows the search space and filters out irrelevant or low-quality pages.

Fine-grained LLM-enhanced dense retrieval uses a pre-trained LLM (such as GLM-130B) to encode each page in that list and compute its similarity to the query encoding. This step further improves retrieval quality, surfacing the most relevant and valuable pages.
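The two stages above can be sketched as a coarse candidate fetch followed by a similarity re-rank. This is a minimal illustration only: `web_search` is a placeholder for a real search-engine call, and the bag-of-words "embedding" stands in for the LLM-enhanced dense encoder, neither of which is reproduced here.

```python
import math
import re
from collections import Counter

def web_search(query):
    # Coarse stage: placeholder for a real search-engine call (e.g. Bing)
    # that would return candidate page snippets for the query.
    return [
        "WebGLM augments GLM with web search and retrieval.",
        "WebGL is a JavaScript API for rendering 3D graphics.",
        "The scorer ranks replies by learned human preferences.",
    ]

def embed(text):
    # Toy bag-of-words vector; the real system uses a dense encoder here.
    return Counter(re.findall(r"\w+", text.lower()))

def cosine(a, b):
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query, k=2):
    # Fine stage: re-rank the coarse candidates by similarity to the query.
    q = embed(query)
    pages = web_search(query)
    return sorted(pages, key=lambda p: cosine(q, embed(p)), reverse=True)[:k]
```

With this toy setup, `retrieve("web search retrieval WebGLM", k=1)` ranks the WebGLM snippet first, since it shares the most terms with the query.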

Bootstrapped Generator

The bootstrapped generator leverages the generative ability of GLM (for example, GLM-130B, the bilingual open-source pre-trained model released by Tsinghua University) to produce detailed answers to questions.

Using this generator, the authors build WebGLM-QA, an LLM-bootstrapped dataset of citation-grounded, long-form question answering. It is cleaned and filtered through strategies such as in-context learning, ultimately yielding 45k high-quality samples filtered from 83k noisy ones. WebGLM's backbone is a GLM model trained on this dataset.

Given the query and the retrieved page content, the generator can produce multiple candidate replies and rank them according to rules: for example, it prefers replies that contain cited information, are of moderate length, grammatically correct, logically coherent, informative, and non-repetitive.
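The rule-based ranking described above can be sketched as a simple scoring function over candidate replies. The specific heuristics and thresholds below are invented for illustration; the paper lists the criteria (citations, length, repetition) but not exact rules.

```python
import re

def reply_score(reply, min_len=20, max_len=400):
    """Score a candidate reply by illustrative quality heuristics."""
    score = 0
    if re.search(r"\[\d+\]", reply):          # contains a citation like [1]
        score += 2
    if min_len <= len(reply) <= max_len:      # moderate length
        score += 1
    words = reply.lower().split()
    if words and len(set(words)) / len(words) > 0.7:  # low repetition
        score += 1
    return score

def select_reply(candidates):
    # Keep the highest-scoring candidate reply.
    return max(candidates, key=reply_score)
```

For instance, a cited, non-repetitive reply of reasonable length outranks a short, repetitive one, so `select_reply` returns the former.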

Human-Preference-Based Scorer

The human-preference-based scorer evaluates the quality of generated responses by learning from ordinary users' preference signals rather than costly expert feedback, helping the system produce useful and engaging content.

The authors train the scorer with a contrastive-learning-style method, collecting relative human preferences between different responses. The trained model assigns a composite score reflecting characteristics such as relevance, accuracy, fluency, diversity, and interestingness.

The scorer can then serve as a post-processing module that filters and reranks the bootstrapped generator's outputs, improving user experience and satisfaction.
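The core of such preference training can be sketched with a pairwise ranking loss: given a scorer s(x) over responses, pushing the preferred response's score above the rejected one's. The function below is a generic sketch of this idea, not WebGLM's exact training objective.

```python
import math

def pairwise_loss(score_preferred, score_rejected):
    """-log sigmoid(s+ - s-): small when the preferred response
    scores well above the rejected one, log(2) when they tie."""
    margin = score_preferred - score_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

def rerank(responses, scorer):
    # Post-processing use: keep the response the trained scorer likes best.
    return max(responses, key=scorer)
```

Widening the margin between the preferred and rejected scores drives the loss toward zero, which is exactly the gradient signal that teaches the scorer to reproduce human rankings.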

How does WebGLM perform?

The authors evaluate WebGLM on several publicly available question-answering and chat datasets and compare it with other state-of-the-art models. WebGLM achieves significant improvements across metrics, demonstrating its effectiveness and strength.

For example, on the TriviaQA dataset, WebGLM reaches 67.8% EM (exact match) and 76.2% F1 (the harmonic mean of token-level precision and recall), 5.6 and 4.2 points higher than OpenAI's WebGPT, respectively.
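For readers unfamiliar with these metrics, EM and F1 are standard for extractive QA and can be computed roughly as follows (real evaluation scripts add further answer normalization, e.g. stripping articles and punctuation):

```python
import re
from collections import Counter

def normalize(text):
    # Lowercase and tokenize; official scripts normalize more aggressively.
    return re.findall(r"\w+", text.lower())

def exact_match(pred, gold):
    # EM: 1.0 iff the normalized prediction equals the normalized answer.
    return float(normalize(pred) == normalize(gold))

def f1(pred, gold):
    # F1: harmonic mean of token precision and recall over the overlap.
    p, g = normalize(pred), normalize(gold)
    common = sum((Counter(p) & Counter(g)).values())
    if common == 0:
        return 0.0
    precision, recall = common / len(p), common / len(g)
    return 2 * precision * recall / (precision + recall)
```

For example, the prediction "the city of Paris" against the gold answer "Paris" gets EM 0 but F1 0.4: recall is perfect, while precision is 1/4.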

On the Persona-Chat dataset, WebGLM reaches a PPL (perplexity, lower is better) of 9.6 and a BLEU score of 2.1, respectively 0.4 lower and 0.1 higher than OpenAI's WebGPT.

In addition, the authors invited real users to interact with WebGLM and collected their feedback. Users generally found WebGLM a fun, intelligent, friendly, and useful chatbot that provides rich, accurate information and adapts to different topics and scenarios.

What is the significance of WebGLM?

WebGLM is a notable research achievement: it demonstrates how to augment a pre-trained large language model with web search and retrieval, yielding an efficient, reliable, and versatile web-enhanced Q&A chatbot.

WebGLM can not only provide users with fast, accurate answers but also generate interesting and useful content, serving both informational and entertainment needs.

WebGLM also suggests a direction for future LLM research: using external knowledge sources to enhance an LLM's capability and generalization, and using human preferences to optimize its output quality.

In short, WebGLM is a web-enhanced Q&A chatbot worth following, and it may become a new benchmark in the LLM field.


Origin blog.csdn.net/virone/article/details/131395440