Deploying TavernAI + KoboldAI Locally for AI Dialogue

Prerequisites:

(1) A GPU. An NVIDIA card is used in this article; AMD should also work, but I haven't tried it. At least 6 GB of video memory; 16 GB or more is better;

(2) A proxy or VPN for unrestricted internet access;

(3) Windows. Linux should also work, but I haven't fully tested it; this guide is based on Windows.

In addition, some English helps, because current KoboldAI dialogue models support Chinese very poorly. You can of course use OpenAI's API instead, or rely on a translation plugin.

This article mainly covers local deployment of KoboldAI. If you prefer Google Colab cloud deployment or the OpenAI API, refer to this article:

https://www.bilibili.com/read/cv22355581?from=search

0. If your English is good, you can read the official tutorials directly:

https://github.com/TavernAI/TavernAI

https://github.com/KoboldAI/KoboldAI-Client

1. Configure TavernAI

(1) Download and install Node.js

Open the address below

https://nodejs.org/download/release/v19.1.0/

Download the file ending in x64.msi, run the installer as administrator, and click Next until the installation completes.

(2) Download and install TavernAI

https://sourceforge.net/projects/tavernaimain/files/TavernAI.rar/download

After downloading, unzip it and run the TavernAI.exe inside; a browser window will open the TavernAI page at the default address 127.0.0.1:8000.

2. Configure KoboldAI

(1) Download and configure

https://github.com/KoboldAI/KoboldAI-Client/archive/refs/heads/main.zip

Unzip it, run install_requirements.bat as administrator, and the interface shown in the figure below will appear:

Enter 1 and press Enter, then wait through a long download. If it fails, it is most likely a network problem: toggle your proxy (on or off) and run install_requirements.bat as administrator again. This time it will ask whether you want to delete the content installed last time; either choice works, but choosing 2 saves time. Press Enter and continue with the installation.

After the configuration is complete, run play.bat and wait for it to load; the following message indicates that the server is up:

Enter 127.0.0.1:5000 in the browser to open the KoboldAI UI page:

(2) Select and load the model

On the KoboldAI UI page, click AI in the upper left corner; a list of models will appear:

To be honest, there are a lot of models, and I have not studied them one by one. For descriptions of some of them, refer to

https://github.com/KoboldAI/KoboldAI-Client

specifically the section "Models the Colab GPU can run", as shown below.

Note that each model has a different style. For example, if you want adventure-style play, choose a model whose Style column says Adventure.

Also pay attention to model size. If you choose a model that is too large, video memory will overflow and the model cannot be loaded. A 6B or 6.7B model needs about 16 GB of VRAM, 2.7B needs 8 GB, 1.3B needs 6 GB, and even a 125M model can probably load with 2 GB. As for 13B or 20B... those probably need professional cards. Strictly speaking, you do not have to load the whole model into VRAM, but generation will be very slow if you don't.
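If you want a rough sanity check before picking a model, the back-of-the-envelope estimate below assumes 16-bit weights at about 2 bytes per parameter plus some runtime overhead; this is my own rule of thumb, not an official KoboldAI figure.

```python
# Rough VRAM estimate for loading a model in 16-bit precision.
# Rule of thumb only (an assumption, not from the KoboldAI docs):
# ~2 bytes per parameter, plus ~25% overhead for activations/runtime.

def vram_estimate_gb(params_billions: float, overhead: float = 1.25) -> float:
    bytes_per_param = 2  # fp16/bf16 weights
    weights_gb = params_billions * 1e9 * bytes_per_param / 1024**3
    return weights_gb * overhead

for size in (0.125, 1.3, 2.7, 6.7, 13, 20):
    print(f"{size:>6}B  ~{vram_estimate_gb(size):5.1f} GB")
# 6.7B comes out to roughly 15-16 GB, in line with the figures above.
```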

Here I use Erebus-6.7B as an example: first click NSFW Models, then click Erebus 6.7B (NSFW), and then click Load below.

The first time you load a model it must be downloaded; watch the KoboldAI server command-line window for the download progress. If an error is reported, close the command line and start again. If it is an HTTP error, you may need to turn your proxy off (or on).

After downloading, the model may be loaded on the CPU; if that happens, close KoboldAI and reopen play.bat. Load the model again and an interface similar to the one shown below should appear. The first slider at the bottom selects how many layers are loaded into video memory; pulled all the way to 32, everything is loaded into VRAM. With fewer than 32 layers, the dialogue gets slower (apparently the fewer layers, the slower), but it saves some video memory. The second slider puts layers on the hard disk; if its value is greater than 0, things will also be slower.

Once loading is done, you can go to TavernAI to start a conversation. You can also type a passage directly on the KoboldAI page and let the AI continue the story from your input.
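If you would rather talk to the loaded model from a script instead of a front end, recent KoboldAI builds expose a small HTTP API on the same port. The endpoint path and response shape below are assumptions based on that API and may differ in your build; a minimal sketch:

```python
# Minimal sketch: ask the locally running KoboldAI server to continue a prompt.
# Assumes a recent KoboldAI build that exposes /api/v1/generate on port 5000;
# the endpoint and field names may differ in your version.
import requests

payload = {
    "prompt": "You wake up in a small tavern. The innkeeper says:",
    "max_length": 80,  # length of the generated continuation, in tokens
}
resp = requests.post("http://127.0.0.1:5000/api/v1/generate", json=payload, timeout=300)
resp.raise_for_status()
print(resp.json()["results"][0]["text"])
```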

3. Usage

The main TavernAI interface comes with a number of character presets made by others, which you can use directly by clicking them.

You can also create one yourself, which is covered in the next section. Click the icon with three horizontal bars in the upper right corner, click Settings in the panel that appears, and enter the KoboldAI UI address from the previous step in the API url field (copy it straight from the browser). Click Connect to connect to the KoboldAI model. A green dot below means everything is fine; a red dot means either the KoboldAI model has not been loaded or you entered the wrong address.
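If the dot stays red and you want to check the back end independently of TavernAI, you can ask the server which model is loaded. The /api/v1/model endpoint used here is again an assumption based on recent KoboldAI builds:

```python
# Quick connectivity check: ask the KoboldAI server which model is loaded.
# Assumes the /api/v1/model endpoint of recent KoboldAI builds.
import requests

try:
    resp = requests.get("http://127.0.0.1:5000/api/v1/model", timeout=10)
    resp.raise_for_status()
    print("KoboldAI is up, model:", resp.json().get("result"))
except requests.RequestException as err:
    print("KoboldAI is not reachable or no model is loaded:", err)
```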

Then click Characters; a list of characters appears, and you can see there are already three official preset characters. Click any of them to start a conversation, as shown in the figure below.

If the AI output is incomplete or keeps producing the same content during a conversation, try pressing Enter to see if more output follows. If the problem persists, video memory may be overflowing; in that case, use a smaller model or reduce the number of layers loaded into video memory when loading the model. Check KoboldAI's command line for details.

4. Set up a character

After opening a conversation, an extra tab with the character's name appears on the right (here, for example, Megumin). Clicking it shows the character's various attribute settings; click Advanced Edit to show more of them:

These settings can be edited directly, but the chat must be restarted for them to take effect. To restart, click the icon with three horizontal bars in the upper left corner of the chat, then click "Start New Chat":

The previous section mentioned that TavernAI ships with some character presets; the following website is another source:

https://booru.plus/+pygmalion

On that website, pick the image of a character you are interested in, click it, click the three dots in the upper right corner, and then click Download original to save the image.

Then go to the TavernAI interface, click +Import in Characters, and import the picture you just downloaded:

You can then click the character-name tab to see all of its setting information.
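These cards are ordinary PNG images with the character definition embedded as metadata. If you are curious what a downloaded card contains before importing it, the sketch below prints the embedded JSON; the "chara" text chunk and base64 encoding are assumptions based on common TavernAI-style cards and may not hold for every file.

```python
# Sketch: print the character definition embedded in a TavernAI-style PNG card.
# Assumes the card stores a base64-encoded JSON blob in a "chara" text chunk,
# which is common for these cards but not guaranteed for every file.
import base64
import json
from PIL import Image  # pip install pillow

def read_card(path: str) -> dict:
    img = Image.open(path)
    raw = img.info.get("chara")  # PNG text chunk, if present
    if raw is None:
        raise ValueError("No embedded character data found")
    return json.loads(base64.b64decode(raw))

card = read_card("megumin.png")  # hypothetical file name
for key in ("name", "personality", "scenario"):
    print(key, "=>", card.get(key))
```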

You can create your own characters with the following website:

https://zoltanai.github.io/character-editor/

Click New character on the right; several text fields appear below and you can start writing the settings. In Personality you can use the W++ format, which helps the model understand your settings better. If you are not sure how to write it, read what others have written for reference:

https://rentry.org/pygtips

The following page has more W++ examples:

https://rentry.org/f3a52

Then, in Scenario (the dialogue setting) and Example Messages (example dialogue), write {{char}} instead of the character's name and {{user}} for yourself. Finally, upload a picture, click Export as Image at the bottom, and you are done. Go back to the TavernAI interface and click +Import under Characters to import it.
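To illustrate what these fields can look like together, here is a sketch that assembles a tiny card as a Python dictionary; the field names and the exact W++ syntax are modelled on the editor above and are assumptions, not a formal specification.

```python
# Sketch of a minimal character definition with W++-style Personality text and
# {{char}}/{{user}} placeholders. Field names follow the character editor above
# and are assumptions, not a formal specification.
import json

card = {
    "name": "Megumin",
    "personality": (
        '[character("Megumin")\n'
        "{\n"
        'Species("human")\n'
        'Personality("dramatic" + "proud" + "kind")\n'
        'Likes("explosion magic")\n'
        "}]"
    ),
    "scenario": "{{char}} meets {{user}} on the road outside town.",
    "mes_example": "{{user}}: Who are you?\n{{char}}: I am {{char}}, the greatest arch-wizard!",
}
print(json.dumps(card, indent=2, ensure_ascii=False))
```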

5. Some parameters

The Settings tab of TavernAI exposes several adjustable parameters. They touch on some fairly academic topics that I do not fully understand; search online for more detail.

Temperature: Increasing it makes the AI's output more random, which also raises the chance of generating content unrelated to the context or answering questions that were never asked. Decreasing it makes the output more stable, but the dialogue may become boring.

Repetition Penalty: Increasing it reduces repetitive output.

Master Settings contains even more parameters, most of which I have not studied, so I will only mention Amount of generation and Context Size. The former is the maximum length of each AI reply; the latter is the maximum combined length of the conversation history plus the prompt sent to the model on each interaction. If you are not sure how to adjust them, keep the defaults.
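For reference, these knobs roughly correspond to fields that the front end sends to KoboldAI with each request; the field names below are assumptions based on the KoboldAI generate API (as in the sketch in section 2) and may not match your version exactly.

```python
# Sketch: how the settings above might map onto fields of the /api/v1/generate
# request shown in section 2. Field names are assumptions based on the KoboldAI
# API and may differ between versions.
generation_settings = {
    "max_length": 100,           # Amount of generation: max tokens per AI reply
    "max_context_length": 1024,  # Context Size: history + prompt budget in tokens
    "temperature": 0.7,          # higher = more random, lower = more stable
    "rep_pen": 1.1,              # Repetition Penalty: values > 1 discourage repeats
}
print(generation_settings)
```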

References

[1] [AI Character Dialogue] Use KoboldAI to have unlimited conversations with AI catgirls (data privatization / can be NSFW) (bilibili.com). https://www.bilibili.com/read/cv22355581?from=search

[2] GitHub - KoboldAI/KoboldAI-Client. https://github.com/KoboldAI/KoboldAI-Client

[3] GitHub - TavernAI/TavernAI: Atmospheric Adventure Chat for AI Language Models (KoboldAI, NovelAI, Pygmalion, OpenAI ChatGPT, GPT-4). https://github.com/TavernAI/TavernAI

[4] AI Character Editor (zoltanai.github.io). https://zoltanai.github.io/character-editor/

[5] W++ Examples (rentry.org). https://rentry.org/pygtips

[6] Making a consistent character and the uses of {{user}} & {{char}} (rentry.org). https://rentry.org/OtherCharAiGuide

[7] Moose's Guide to DIY Roko! (TavernAI/OpenAI) (rentry.org). https://rentry.org/moosetavernai

[8] +pygmalion (booru.plus). https://booru.plus/+pygmalion
