Li Yunlong and Sheldon become professional chat companions! ChatHaruhi goes viral: 32 Chinese and foreign characters supported, 54,000 dialogue entries open-sourced

Source | Xinzhiyuan
Author | LRS

"Role playing" is one of the more interesting application scenarios of large-scale language models. LLM can talk to users in the tone of a specified character, and can also achieve time-space communication such as "Jobs and Socrates."

Many companies have released role-playing products built on language models, such as Glow and Character.AI. Users can easily create a "cyber wife," and there are many potential applications in games, creative industries, and other fields.

Recently, a fun role-playing chat system, "Chat Haruhi Suzumiya" (ChatHaruhi), was open-sourced on GitHub. It is built on the Chinese language model "Luotuo," can imitate the conversational styles of 32 Chinese and foreign characters, and also provides a dataset of more than 54,000 dialogue entries.

Project link:
https://github.com/LC1332/ChatHaruhi-Suzumiya
Paper link:
https://arxiv.org/abs/2306.09479

At present, the project is still under active development and already offers several ways to try it out.

Demo link:
https://huggingface.co/spaces/chenxiYan/ChatHaruhi-OpenAI

Users can give themselves a name (preferably one close to the original work) and then type questions to start chatting. For example, when playing "Xiu Qin" and talking to Li Yunlong, the model holds up well across multiple rounds, and the style of the simulated dialogue is very true to the character.

The basic idea

In open source role-playing implementations, users typically type in the prompt:

I want you to act like {character} from {series}. I want you to respond and answer like {character} using the tone, manner and vocabulary {character} would use. Do not write any explanations. Only answer like {character}. You must know all of the knowledge of {character}. My first sentence is "Hi {character}."


This simple approach gives the language model some role-playing ability, but it leans heavily on the knowledge already inside the model and cannot play characters the model only vaguely remembers or that fall outside its training corpus.

Moreover, "knowing all of the knowledge of the character" is a very vague instruction, and the model will still hallucinate.

Even with a sufficiently clear prompt, generation is still shaped by the preferences of the underlying language model. Iterating on the prompt may alleviate this, but the workload becomes heavy once there are many characters.

Fine-tuning the model on a character's dialogue data is another option, but the researchers found that fine-tuned chatbots hallucinate more; and for the minor characters they want to imitate, it is hard to collect enough data for fine-tuning.

The goal of the ChatHaruhi project is to let language models imitate character styles across genres such as anime, TV series, and novels. The developers argue that a virtual character consists mainly of three components:

1. Knowledge and background

Each virtual character exists in its own setting background. For example, the characters in "Harry Potter" exist in the magical world, Haruhi Suzumiya lives in a high school in Japan, etc.

When building a chatbot, we hope that it can understand the setting of the corresponding story, which is also a major test of language model memory and usually requires an external knowledge base.

2. Personality

The character's personality is also a very important part of literary and artistic works and must remain coherent or consistent throughout the work. Some authors will even define the character's personality before writing.

3. Linguistic habits

Language habits are the easiest to imitate, as long as appropriate examples are given in the context.

The key idea of the project is to extract as much of the original script as possible to form a memory bank for the target character.

When the user asks a new question, the system retrieves the relevant classic scenes, combines them with the character's preset prompt, and thereby steers the language model to imitate the character more faithfully.
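
In code, this pipeline might look roughly like the following minimal Python sketch. The function name and the story/history formats are illustrative assumptions, not ChatHaruhi's actual API.

```python
# Minimal sketch of the retrieval-augmented role prompt described above.
# All names and data formats here are illustrative, not ChatHaruhi's real API.

def build_prompt(system_prompt: str,
                 stories: list[str],
                 history: list[tuple[str, str]],
                 query: str) -> str:
    """Assemble: role prompt + retrieved classic scenes + chat history + query."""
    parts = [system_prompt]
    for story in stories:                      # classic scenes from the memory bank
        parts.append(f"Classic scene:\n{story}")
    for user_msg, bot_msg in history:          # earlier rounds keep the chat coherent
        parts.append(f"User: {user_msg}\nCharacter: {bot_msg}")
    parts.append(f"User: {query}\nCharacter:")
    return "\n\n".join(parts)
```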

The researchers also designed a pipeline that automatically generates dialogue matching a character's personality, producing enough data to fine-tune even characters with little original dialogue.

Chatbot design

Given a specific role R and a user query q, the chat task can be modeled as the probability of generating a reply a, conditioned on the character's knowledge background, personality, and language habits, denoted collectively as Θ:

P(a | q, R, Θ)

Character R can be specified by a prompt text (I want you to act like character from series...).
Similar to in-context learning, the character's previous dialogue turns (u_1, a_1), ..., (u_M, a_M), where u_i is a user-side line and a_i the character's reply, can be added as conditioning context:

P(a | q, R, Θ, (u_1, a_1), ..., (u_M, a_M))

For characters with a larger story world, the most relevant dialogue sequences first need to be retrieved from the memory bank.

System Prompt

The general role-playing prompt mentioned earlier, as used with ChatGPT, has two areas that need improvement:

1. Reluctance to repeat lines

Models such as ChatGPT and LLaMA 2 are trained with reinforcement learning from human feedback (RLHF); because they face tasks like "Give me m different options" or "Generate m titles," they develop a preference for not repeating content that already appears in context.

When imitating a character, the prompt therefore needs to explicitly state that the model may reuse classic lines from the novel or movie.

2. The character's personality is not distinctive enough

Due to the RLHF mechanism, each language model has its own specific language preferences, which affect the quality of the imitation. Appending a description of the character's personality to the end of the prompt improves this.
The improved prompt is as follows:

I want you to act like {character} from {series}. You are now cosplay {character}. If others' questions are related with the novel, please try to reuse the original lines from the novel. I want you to respond and answer like {character} using the tone, manner and vocabulary {character} would use. You must know all of the knowledge of {character}. {Supplementary explanation of the character's personality}
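
For illustration, the template above can be filled per character with ordinary string formatting; the character details below are placeholders, not data from the project.

```python
# Filling the improved role-play template for one character; the
# personality text is a placeholder example, not taken from the project.
TEMPLATE = (
    "I want you to act like {character} from {series}. "
    "You are now cosplay {character}. If others' questions are related with "
    "the novel, please try to reuse the original lines from the novel. "
    "I want you to respond and answer like {character} using the tone, manner "
    "and vocabulary {character} would use. "
    "You must know all of the knowledge of {character}. {personality}"
)

system_prompt = TEMPLATE.format(
    character="Haruhi Suzumiya",
    series="The Melancholy of Haruhi Suzumiya",
    personality="Haruhi is energetic, willful, and easily bored.",
)
```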

Character dialogue

In order to better reproduce the behavior of characters in novels, TV series, and movies, the researchers collected a large number of excerpts from classic scripts. However, except for a few characters (such as crosstalk performer Yu Qian), the dialogues are not all in question-and-answer form; instead, the collected records are organized into "stories."

Original conversation search

After a query is input, the M most similar samples are selected from the dialogue database D according to embedding similarity; the exact value of M depends on the language model's token limit.
When building the conversational memory, the researchers recommend keeping each story short enough that it does not crowd out other stories during retrieval.
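
A minimal sketch of this retrieval step, assuming the story embeddings are precomputed and using an approximate token budget in place of the model-specific limit (names and numbers here are illustrative):

```python
import numpy as np

def retrieve_stories(query_emb: np.ndarray,      # embedding of the user query
                     story_embs: np.ndarray,     # (num_stories, dim) matrix
                     stories: list[str],
                     token_counts: list[int],
                     token_budget: int = 1500) -> list[str]:
    """Select the most similar stories until the token budget is used up."""
    # Cosine similarity between the query and every stored story.
    sims = story_embs @ query_emb / (
        np.linalg.norm(story_embs, axis=1) * np.linalg.norm(query_emb) + 1e-8)
    selected, used = [], 0
    for idx in np.argsort(-sims):                # most similar first
        if used + token_counts[idx] > token_budget:
            continue                             # this story would overflow; skip it
        selected.append(stories[idx])
        used += token_counts[idx]
    return selected
```

Keeping each story short, as recommended above, prevents one long excerpt from consuming the whole budget and crowding out other relevant scenes.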

Chat memory

Within a session, every user query and chatbot response is recorded, and the whole dialogue sequence is fed back into the language model to keep the conversation coherent.
In the actual implementation, tokens are counted backward from the most recent turn, and the dialogue-history input to the language model is capped at 1200 tokens, which accommodates roughly 6-10 rounds of dialogue.
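
A sketch of that 1200-token history window; count_tokens stands in for whatever tokenizer-specific counter the underlying model provides:

```python
def trim_history(history: list[tuple[str, str]],
                 count_tokens,                    # tokenizer-specific counter
                 max_tokens: int = 1200) -> list[tuple[str, str]]:
    """Keep the most recent dialogue rounds that fit within max_tokens.

    Walks backward from the newest (user, bot) round, as described above.
    """
    kept, total = [], 0
    for user_msg, bot_msg in reversed(history):
        cost = count_tokens(user_msg) + count_tokens(bot_msg)
        if total + cost > max_tokens:
            break                                 # older rounds no longer fit
        kept.append((user_msg, bot_msg))
        total += cost
    return list(reversed(kept))                   # restore chronological order
```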

Dialogue synthesis

Currently, ChatHaruhi can only perform role playing through the ChatGPT or Claude APIs. To migrate this capability to a local model, a suitable training dataset still has to be constructed.
Generate conversations from questions

Since the collected data is not strictly in question-and-answer form, the researchers treat all dialogue that precedes the target character's first line in each story as the question, feed it into the language model, and have the model generate the subsequent dialogue.
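
Sketched in code, this splitting step might look as follows; the (speaker, utterance) story format is an assumption made for illustration:

```python
def story_to_pair(story_lines: list[tuple[str, str]],
                  target: str) -> tuple[str, str] | None:
    """Split a story at the target character's first line.

    Everything before that line becomes the 'question'; the character's own
    line seeds the answer the language model is asked to continue.
    """
    for i, (speaker, utterance) in enumerate(story_lines):
        if speaker == target:
            context = "\n".join(f"{s}: {u}" for s, u in story_lines[:i])
            return context, utterance
    return None                                   # target never speaks in this story
```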

Question generation

It should be noted that some characters have very limited data, not enough to fine-tune the language model.

To augment the question data for such characters, the researchers first give a model such as Alpaca a clear question-answer pair, have it generate about 10 heuristic variants, and then use predefined prompts to regenerate training data for the character.
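
A hedged sketch of this augmentation loop; generate_fn is a placeholder for whichever model call is available (for example, an Alpaca-style model), and the prompt wording is illustrative:

```python
def augment_questions(seed_question: str,
                      seed_answer: str,
                      generate_fn,                 # generate_fn(prompt) -> str
                      n: int = 10) -> list[str]:
    """Ask a helper LLM for ~n paraphrased questions in the same spirit."""
    prompt = (
        "Below is a question a fan asked a fictional character, together with "
        "the character's answer.\n"
        f"Question: {seed_question}\n"
        f"Answer: {seed_answer}\n"
        f"Write {n} different questions, one per line, that would elicit an "
        "answer in the same style."
    )
    raw = generate_fn(prompt)
    # Keep non-empty lines and cap at n questions.
    return [line.strip() for line in raw.splitlines() if line.strip()][:n]
```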

In total, the researchers collected 22,752 original dialogue entries and 31,974 simulated questions, forming the ChatHaruhi-v1 dataset.

Experimental results

At present, quantitative experiments and user studies are still in progress; the article only briefly compares several configurations qualitatively:

1. GPT-3.5 Turbo, with only the system prompt

2. GPT-3.5 Turbo, with the complete prompt, conversation history, and question

3. ChatGLM2, with only the system prompt

4. ChatGLM2, with the complete prompt

5. ChatGLM2 fine-tuned on ChatHaruhi data, with the complete prompt


Original article: blog.csdn.net/xixiaoyaoww/article/details/132859915