How to enhance the learning ability of a large language model (LLM)?

As we all know, the ChatGPT model's knowledge stops at 2021, so when we ask ChatGPT about events after 2021, it often gives completely wrong answers. It also sometimes answers other questions incorrectly, such as historical or literary ones, which may be because the model never learned the relevant knowledge.

 

Baidu search, however, can give us the correct answer. This raises two questions:

How can we make ChatGPT learn knowledge from after 2021?
How can we keep ChatGPT from making common-sense mistakes?

Because Baidu search can provide the latest and most timely information, we can develop a web crawler that scrapes Baidu search results and then sends them to ChatGPT. ChatGPT can then draw on that up-to-date information, and it may stop making common-sense mistakes!
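The idea above can be sketched as a small pipeline: search first, then put the search results in front of the question and pass everything to the model. This is only a sketch of the flow. The names answer_with_search, search and complete are hypothetical, and the two fake_ functions are stand-ins so the example runs without a browser or an API key.

```python
from typing import Callable, List


def answer_with_search(
    question: str,
    search: Callable[[str], List[str]],
    complete: Callable[[str], str],
) -> str:
    """Search first, then ask the model with the results placed before the question."""
    snippets = search(question)  # e.g. text scraped from a search results page
    prompt = "\n".join(snippets + [question])
    return complete(prompt)


# Stand-ins (assumptions for illustration): no real search or model is called.
def fake_search(question: str) -> List[str]:
    return ["(search result text would go here)"]


def fake_complete(prompt: str) -> str:
    return f"model saw {len(prompt.splitlines())} lines"


print(answer_with_search("Who won?", fake_search, fake_complete))
```

Keeping the search step and the model call as separate, swappable functions makes it possible to test the flow on its own before wiring in a real crawler and a real API.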

Develop a web crawler to crawl the results of Baidu search 

Our idea is to develop a Baidu search web crawler first. When a user asks ChatGPT about something that happened after 2021, we first run a Baidu search on the question and then feed the search results to ChatGPT, so that it has the latest and most timely information and is less likely to give a wrong answer. First, though, we need to install two Python packages:

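pip install "openai<1.0"   # the code below uses the legacy openai.Completion interface, which was removed in openai 1.0
pip install selenium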

Next, we need to write a Baidu search crawler function, scraping_data(), which takes the parameter question and runs a Baidu search in a background (headless) browser. After getting the search results, we extract the text of the elements whose class name is "c-border". The c-border elements generally hold the information pinned to the top of the Baidu results page, which is a refined answer that Baidu itself has extracted.

We get this top information first. If it does not exist, we fall back to the content of each remaining search result, which is stored in the elements whose class name is "content-right_8Zs40":

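import openai
from selenium import webdriver
from selenium.webdriver.common.by import By
from urllib.parse import quote
import time
import warnings
warnings.filterwarnings("ignore")

# Scrape Baidu search results
def scraping_data(question):
    options = webdriver.ChromeOptions()
    # Run Chrome without a visible window (the options.headless property is deprecated in newer Selenium releases)
    options.add_argument("--headless")
    # URL-encode the question so spaces and non-ASCII characters survive in the query string
    url = "https://www.baidu.com/s?wd=" + quote(question)
    browser = webdriver.Chrome(options=options)
    try:
        browser.get(url)
        browser.execute_script("window.scrollTo(0, document.body.scrollHeight)")
        time.sleep(3)  # give the page time to finish loading

        data = []
        # Scrape the text of the c-border elements
        border = browser.find_elements(By.CLASS_NAME, 'c-border')
        for item in border:
            data.append(item.text)
        # If there is no c-border data, scrape the content-right_8Zs40 elements instead
        if len(data) == 0:
            content = browser.find_elements(By.CLASS_NAME, 'content-right_8Zs40')
            for item in content:
                data.append(item.text)
        return data
    finally:
        # Always close the browser, even if scraping fails
        browser.quit()

# "Which team won the 2022 World Cup?"
question = '2022年世界杯冠军是哪支球队?'
result = scraping_data(question)
print(result)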
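The fallback rule used in scraping_data() (prefer the c-border text, otherwise use the content-right_8Zs40 text) can also be isolated from the browser so it can be checked on plain lists of strings. The sketch below is not part of the original code: select_snippets is a hypothetical helper, and it assumes that blank or repeated entries should be treated as missing, which is slightly stricter than the len(data)==0 check above.

```python
from typing import Iterable, List


def select_snippets(top_texts: Iterable[str], result_texts: Iterable[str]) -> List[str]:
    """Prefer the highlighted top-of-page text; fall back to the ordinary results."""

    def clean(texts: Iterable[str]) -> List[str]:
        seen = set()
        kept = []
        for text in texts:
            text = text.strip()
            # Skip blank entries and exact repeats, keeping first-seen order.
            if text and text not in seen:
                seen.add(text)
                kept.append(text)
        return kept

    top = clean(top_texts)
    return top if top else clean(result_texts)


print(select_snippets(["", "  "], ["result A", "result A", "result B"]))
```

In the crawler, the two arguments would be the item.text values collected from the two find_elements() calls.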

Combine the LLM with the web crawler

Here we combine a large language model such as ChatGPT with the web crawler. We first search Baidu for the question the user asked, and then feed the search results together with the question to ChatGPT, so that ChatGPT has the latest and most timely information and no longer gives wrong answers.

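# The API key you applied for
openai.api_key = "your_API_key"

def get_answer(data):
    # Legacy completions interface (openai package versions before 1.0).
    # text-davinci-003 is the model the original post used; if it is no longer
    # offered, substitute a completion model that is currently available.
    response = openai.Completion.create(
        model="text-davinci-003",
        prompt="\n".join(data),
        temperature=0.5,
        max_tokens=2048)
    return response.choices[0].text

def ask_question():
    print()
    greeting = "\033[1;31mChatGPT: I am the ChatGPT chatbot and I can answer any of your questions! To exit, type: quit\033[0m"
    print(greeting)
    print()
    while True:
        question = input()
        if question == 'quit':
            print()
            print("\033[1;31mChatGPT: Until next time, bye!\033[0m")
            break
        # Scrape the Baidu search results
        data = scraping_data(question)
        data.append(question)
        # Feed the Baidu search results together with the current question to ChatGPT
        answer = get_answer(data)
        # Strip the leading newline/whitespace that the completion usually starts with
        answer = answer.strip()
        print(f"\033[1;31mChatGPT: {answer}\033[0m")
        print()

ask_question()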
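The code above joins the search results and the question with newlines and sends the whole thing as the prompt. Because the prompt and the completion share the model's context window, a long page of search results may be rejected when max_tokens is already 2048. A minimal sketch of a prompt builder that caps the amount of search text is shown below. The function build_prompt, the wording of the instructions, and the 3000-character default are all assumptions for illustration, and counting characters is only a rough proxy for counting tokens.

```python
from typing import List


def build_prompt(snippets: List[str], question: str, max_chars: int = 3000) -> str:
    """Join search snippets and the question, keeping snippets within a character budget."""
    header = "Answer the question using the search results below.\n\nSearch results:\n"
    footer = f"\n\nQuestion: {question}\nAnswer:"
    budget = max(max_chars - len(header) - len(footer), 0)

    kept = []
    used = 0
    for snippet in snippets:
        cost = len(snippet) + 1  # +1 for the newline that joins snippets
        if used + cost > budget:
            break  # stop at the first snippet that does not fit
        kept.append(snippet)
        used += cost

    return header + "\n".join(kept) + footer


print(build_prompt(["Search result one.", "Search result two."], "Who won?"))
```

To try it with the code above, get_answer() could be called with a single-element list such as [build_prompt(data, question)], where data holds only the scraped results.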

 

Here we can see that when we ask ChatGPT about the champion and runner-up of the 2022 World Cup, ChatGPT gives the correct answer, because it has read our Baidu search results and extracted the answer from them.

Summary

Since the knowledge of the current ChatGPT model only extends to 2021, ChatGPT often gives strange and wrong answers when users ask about events after 2021. To avoid this problem, we can give ChatGPT the results of a Baidu search alongside the question, which can greatly reduce the probability of a wrong answer.


Origin blog.csdn.net/weixin_42608414/article/details/129151652