A detailed introduction to NLP dialog systems

Task-Based Dialogue System

Task-based dialogue systems are mainly used in fixed domains. Two approaches are widely used: the modular (pipeline) approach and the end-to-end approach.
The modular approach decomposes dialogue response into modules; each module is responsible for a specific task and passes its result to the next module.
The end-to-end approach does not design each sub-module independently; instead it directly learns the mapping between the dialogue context and the system reply, which makes the design simpler. Related research falls into two categories: retrieval-based methods and generation-based methods.
The main task of the natural language understanding (NLU) module is to map the user's natural-language input to the user's intent and slots. The module outputs the intent, slot, and slot value as a triple, which is passed on to the dialogue state tracking module.

[Case Analysis]
User input: "What's the weather in Beijing today?"
User intent definition: Ask the weather
Slot definition:
Slot 1: Time
Slot 2: Location
For the "Ask the weather" task we have defined two required slots: time and location.
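As a toy sketch of this NLU output format (the keyword lexicons and intent name below are invented for illustration, not a trained model), a rule-based extractor can emit (intent, slot, slot value) triples for the example above:

```python
# Toy illustration only: a rule-based NLU that maps an utterance
# to (intent, slot, slot value) triples.
def naive_nlu(utterance):
    triples = []
    if '天气' in utterance or 'weather' in utterance:
        intent = 'ask_weather'
        # Hypothetical keyword lexicons standing in for a trained tagger
        for time_word in ['今天', 'today', '明天', 'tomorrow']:
            if time_word in utterance:
                triples.append((intent, 'time', time_word))
        for place in ['北京', 'Beijing', '上海', 'Shanghai']:
            if place in utterance:
                triples.append((intent, 'location', place))
    return triples

print(naive_nlu("What's the weather in Beijing today?"))
# [('ask_weather', 'time', 'today'), ('ask_weather', 'location', 'Beijing')]
```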

Intents and Slots

Intent recognition classifies the user's natural-language input into categories that correspond to user intents. For example, the intent of "what time is it now" is "ask the time", and the intent of "what's the temperature today" is "ask the weather". Intent recognition is therefore a typical classification problem, generally multi-class; in other words, it is the familiar text classification task.
Intent recognition methods: after feature extraction, classifiers such as SVM or TextCNN are trained; sequence-labeling methods have also been used.
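The classifiers just mentioned require training data and feature pipelines; as a purely illustrative stand-in, the shape of the task can be shown with a hand-written bag-of-keywords scorer (the lexicons and intent labels below are invented):

```python
# Minimal stand-in for the SVM/TextCNN classifiers mentioned above:
# score each intent by how many of its keywords appear in the input.
INTENT_KEYWORDS = {
    'ask_time': ['time', 'clock', 'hour'],
    'ask_weather': ['weather', 'temperature', 'rain'],
}

def classify_intent(utterance):
    words = utterance.lower().split()
    scores = {intent: sum(w.strip('?.,!') in kws for w in words)
              for intent, kws in INTENT_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else 'unknown'

print(classify_intent("What time is it now"))           # ask_time
print(classify_intent("What's the temperature today"))  # ask_weather
```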

A slot is a parameter of an intent. Once the intent is determined (asking about the weather, querying books, and so on), it can correspond to several parameters. For example, searching for a book in an online library requires parameters such as the title, author, and publisher; these parameters are the slots of the "query books" intent.
Slot filling methods: slot filling is a sequence labeling problem, and can be implemented with machine-learning algorithms such as HMM and SVM, or with deep-learning models such as RNNs.
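A sketch of slot filling as sequence labeling: instead of a trained HMM/RNN tagger, a dictionary lookup (with made-up lexicons) is enough to show the BIO input/output format:

```python
# Slot filling as sequence labeling: each token gets a BIO tag.
# Real systems learn the tagger; this toy dictionary lookup only
# illustrates the input/output shape.
TIME_WORDS = {'today', 'tomorrow', 'tonight'}
LOC_WORDS = {'beijing', 'shanghai'}

def bio_tag(tokens):
    tags = []
    for tok in tokens:
        low = tok.lower()
        if low in TIME_WORDS:
            tags.append('B-time')
        elif low in LOC_WORDS:
            tags.append('B-location')
        else:
            tags.append('O')  # outside any slot
    return tags

tokens = ['What', 'is', 'the', 'weather', 'in', 'Beijing', 'today']
print(list(zip(tokens, bio_tag(tokens))))
```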

Dialog State Tracking (DST)

Slot filling may take multiple rounds of dialogue. The state tracking module maintains the accumulated information of the ongoing dialogue (the NLU results, system actions, database query results, and so on) to update the current dialogue state.

Through multiple rounds of dialogue, the system gradually clarifies the user's needs; the process of the user expressing those needs is a continuous slot-filling process.
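This multi-turn slot-filling process can be sketched with a minimal tracker that merges each turn's NLU slots into one running state (the required-slot set and class names are illustrative, not a standard API):

```python
# Minimal dialogue state tracker: merge the slots filled on each turn
# into a running state until all required slots are present.
REQUIRED_SLOTS = {'time', 'location'}  # for the 'ask_weather' intent

class DialogStateTracker:
    def __init__(self):
        self.state = {}

    def update(self, slots):
        """Merge newly filled slots from the latest NLU result."""
        self.state.update(slots)
        return self.state

    def missing_slots(self):
        return REQUIRED_SLOTS - self.state.keys()

dst = DialogStateTracker()
dst.update({'location': 'Beijing'})   # turn 1: "weather in Beijing?"
print(dst.missing_slots())            # {'time'}
dst.update({'time': 'today'})         # turn 2: "today"
print(dst.missing_slots())            # set()
```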

Dialog Policy Learning (DPL)

Dialogue policy learning (DPL) decides which reply strategy the system adopts in the current state.
After the user enters an utterance, the NLU and DST modules process it into three result vectors: intent, slot, and slot value. These three vectors are concatenated into a high-dimensional feature vector, which is fed into an LSTM model.
Through a Softmax activation, the LSTM outputs a probability distribution over the system's possible actions at the current moment. Taking the index of the highest-probability element selects the corresponding action and its generation template, from which the system's reply is generated and returned to the user.
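A sketch of this policy step, with the LSTM replaced by made-up action scores (the action names are hypothetical): softmax the scores, take the argmax index, and use it to select the reply action.

```python
import math

# Pretend LSTM output for the current state; in a real system these
# scores come from the trained policy network.
ACTIONS = ['request_time', 'request_location', 'inform_weather']
scores = [0.2, 1.5, 0.4]

def softmax(xs):
    exps = [math.exp(x) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax(scores)
best = max(range(len(probs)), key=probs.__getitem__)  # argmax
print(ACTIONS[best])  # request_location
```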

Natural Language Generation (NLG)

Natural Language Generation (NLG) aims at converting semantic representations into natural language utterances.
There are three main generation methods for natural language generation:
(1) Template-based methods.
A template is defined in advance and filled in to generate the natural language returned to the user, for example: "A flight ticket from {origin} to {destination} at {time} has been booked for you."
(2) Method based on grammatical rules.
Similar to the grammar-based methods mentioned in natural language understanding, this approach judges whether the user's sentence is interrogative, declarative, and so on, and generates natural language with the help of part-of-speech information.
(3) Generation-based methods.
Models such as Seq2Seq compute a context vector; the decoder then combines the encoder's state, the context vector, and the decoder's previous outputs to predict a probability distribution over the vocabulary for the current output.
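The template-based method (1) amounts to slot substitution; a minimal sketch using the example template above, with Python's str.format standing in for the template engine (the slot values are invented):

```python
# Template-based NLG: fill the slots gathered by DST into a canned
# sentence; str.format plays the role of the template engine.
template = ("A flight ticket from {origin} to {destination} "
            "at {time} has been booked for you.")
slots = {'origin': 'Beijing', 'destination': 'Shanghai', 'time': '9:00'}
print(template.format(**slots))
```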

In such a dialogue system, a large number of question templates are first extracted from the training corpus to form a template pool; there is then a many-to-many correspondence between the relation vectors derived from the question set and the answer template pool.

Small talk dialogue system

Small-talk dialog systems are also known as open-domain dialog systems, or chatbots.
Chit-chat dialogue systems have no task: they are developed for pure chat or entertainment, and their purpose is to generate meaningful, relevant responses.
Open-domain chit-chat systems are mainly divided into retrieval-based and generative systems. A retrieval-based system queries and ranks candidates against the user's input and returns the best answer; a generative system produces new text as the answer, learned from an existing corpus. At present, generative chit-chat systems are more a research topic than a practical application, and many problems are still at the exploratory stage.

Retrieval dialog system

A retrieval-based dialogue system uses matching techniques to retrieve and rank utterances in a predefined database against the user's query, and selects the highest-ranked reply. The core problem is how to build a better query-reply matching model.

Case Study: The Simplest Dialogue System

import random

# Greeting data
greetings = ['你好', 'hello', 'hi', '早上好', 'hey!', 'hey']
# Questions meaning "How are you?"
question = ['How are you?', '你好吗?']
# Replies meaning "I'm fine"
responses = ['Okay', "I'm fine"]

# Loop forever, answering input until the user says '再见' (goodbye)
while True:
    userInput = input(">>> ")
    if userInput in greetings:
        # Reply with a randomly chosen greeting
        print(random.choice(greetings))
    elif userInput in question:
        # Reply with a randomly chosen response
        print(random.choice(responses))
    elif userInput == '再见':
        # End the program when the user says goodbye
        break
    else:
        print("我不知道你在说什么")

The output is as follows:

>>>你好
早上好
>>>你好吗？
我不知道你在说什么
>>>你好吗?
I'm fine

The architecture of a retrieval-based chat system usually involves several stages: question analysis, a rule system, coarse ranking, and fine ranking.
Finally, the response with the highest score in the corpus is selected.

Coarse ranking model

The coarse ranking model is a component of retrieval-based question answering whose goal is to quickly retrieve similar questions from the corpus. To understand the coarse ranking model, you first need to understand the inverted index.

An inverted index is also called a reverse index; it inverts the forward ordering of the corpus, hence the name. A forward index uses the document id as the key and the document content as the record, whereas an inverted index uses terms from the document content as keys and document ids as records.
For example, the record for the term "Yanshan University" is 0, indicating that the term appears in document 0.
Coarse ranking model:
With the inverted index, documents containing the query words can be found quickly, which greatly improves retrieval speed. Documents matched only through very common words (function words, prepositions, and so on) are ranked toward the back, while words that occur rarely are treated as likely keywords, so the documents containing them are ranked toward the front.
Coarse ranking implementation: sort the candidate questions in ascending order of the corpus frequency of their matched words, and take the top candidates as the coarse ranking result.
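A minimal sketch of the inverted index and the coarse ranking rule just described, using a tiny invented corpus and naive whitespace tokenization:

```python
# Build an inverted index, then rank candidates so that matches on
# rare (likely key) words come first.
corpus = [
    'yanshan university is in hebei',      # doc 0
    'which university is in beijing',      # doc 1
    'the weather in beijing is cold',      # doc 2
]

# Inverted index: term -> set of document ids containing it
inverted = {}
for doc_id, doc in enumerate(corpus):
    for term in doc.split():
        inverted.setdefault(term, set()).add(doc_id)

def coarse_rank(query, top_k=2):
    # Score each candidate by the corpus frequency of its rarest
    # shared term; ascending frequency puts rare-word matches first.
    scores = {}
    for term in query.split():
        for doc_id in inverted.get(term, ()):
            rarity = len(inverted[term])
            scores[doc_id] = min(scores.get(doc_id, rarity), rarity)
    return sorted(scores, key=scores.get)[:top_k]

print(coarse_rank('yanshan university'))  # [0, 1]
```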

Fine ranking model

The coarse ranking model compares word by word; the fine ranking model instead compares the similarity of two whole sentences. It is not hard to see that the coarse model ignores word order: "I invite you to dinner" and "You invite me to dinner" produce the same coarse ranking result.
The inverted index also relies heavily on word segmentation. In the example above, "Yanshan" + "University" and "Yanshan University" are two different segmentations and give completely different results.
To address the segmentation problem, recent practice takes characters as the input unit, and character-level input has gradually surpassed word-level input in effectiveness. The word-order problem can be handled by deep neural networks.
Sentence2Vector fine ranking model structure:

Sentences are fed into an LSTM, which treats the sentence as a time series and builds associations between the words: the output at the previous step is taken as part of the input at the current step. In this way "you invite me to dinner" and "I invite you to dinner" receive different representations, solving the word-order and segmentation problems left over from the coarse ranking model.
Self-Attention assigns different attention weights to different words during processing.
The Dense module converts the output of the Self-Attention layer into a vector; taking the inner product of the query-sentence vector with each corpus-question vector yields the similarity between the two sentences.
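The trained Sentence2Vector encoder is beyond a short sketch, but the final similarity step can be illustrated with stand-in character-count vectors (note: unlike the LSTM encoder, this toy representation is order-insensitive, so it cannot distinguish the two dinner sentences):

```python
import math

# Toy sentence vectors (character counts) standing in for the trained
# Sentence2Vector outputs; only the similarity step is the point here.
def char_vector(sentence, alphabet):
    return [sentence.count(ch) for ch in alphabet]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

corpus_questions = ['你请我吃饭', '今天天气怎么样']
query = '我请你吃饭'
alphabet = sorted(set(''.join(corpus_questions + [query])))

q_vec = char_vector(query, alphabet)
sims = [cosine(q_vec, char_vector(s, alphabet)) for s in corpus_questions]
print(corpus_questions[max(range(len(sims)), key=sims.__getitem__)])
```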

The Sentence2Vector model is broadly similar to the CBOW structure that uses several words to predict one word, except that Sentence2Vector replaces words with sentences: it vectorizes sentences and can compute their similarity at the same time.


Origin blog.csdn.net/zag666/article/details/128283178