Deploying stable-diffusion-webui on a cloud server

My workflow, plus a whole pile of pitfall notes

This all started because my own graphics card is too weak, and I happened to have time left on a rented deep-learning server. If I only wanted to generate images, none of this trouble would be necessary, but training needs VRAM, and I don't plan to scrape by on a small laptop forever.

This was written on 2022-11-12; earlier or later versions may not match what is described here.

The steps I followed

Creating the environment (Linux)

Rented servers usually come with some environment preinstalled, which in theory saves a bit of setup, but my lesson is that it's less trouble to create a clean virtual environment from scratch.

conda create -n sd python=3.8 
conda activate sd

If your platform doesn't come with conda, pip, and so on, it should provide a basic tutorial for installing them. Each platform also has its own recommended way to persist your virtual environment so it survives instance resets.
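One sketch of that (the path is a placeholder; use whatever persistent data disk your platform actually provides) is to point conda at a directory that survives resets, so the sd environment doesn't have to be rebuilt every time:

# placeholder path: substitute your platform's persistent disk
conda config --add envs_dirs /data/conda/envs
# environments created afterwards (including sd) will live there
conda env list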

Then there is the dependency install command from the official instructions:

sudo apt install wget git python3 python3-venv

Create a new folder, cd into it, and clone the webui:

git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui.git

The weights are placed under stable-diffusion-webui/models.
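For instance, the main checkpoint normally goes into the Stable-diffusion subfolder (the URL below is a placeholder; substitute wherever you actually downloaded your weights from):

cd stable-diffusion-webui/models/Stable-diffusion
# placeholder URL — replace with the real download link for your checkpoint
wget -O model.ckpt https://example.com/your-model.ckpt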

If everything goes well, cd into stable-diffusion-webui; at this point you can try the official one-step launcher:

bash <(wget -qO- https://raw.githubusercontent.com/AUTOMATIC1111/stable-diffusion-webui/master/webui.sh)
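Since the repo is already cloned, running the copy of webui.sh that ships with it should amount to the same thing:

# run from inside the stable-diffusion-webui folder
bash webui.sh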

If it doesn't go well, check whether anything similar appears in the pitfalls section below. In the end I gave up on this one-liner and broke it down step by step; see the requirements.txt part later.

Deepbooru installation

Go into the stable-diffusion-webui/models/deepbooru folder. This component auto-tags training images (anime-style); you can skip it if you aren't fine-tuning on anime images. I don't remember whether tensorflow is already covered by the webui's requirements; it may still need to be installed separately, which went without problems.

git clone https://github.com/KichangKim/DeepDanbooru.git
# if you need to install tensorflow yourself
pip install tensorflow
pip install tensorflow-io
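If you did install tensorflow yourself, a quick sanity check that it can actually see the GPU (so the tagging doesn't silently fall back to CPU):

python -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"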

Installing from requirements.txt

Since the official one-liner didn't work for me, run this from inside the webui folder:

pip install -r requirements.txt

Here, and with the earlier commands as well, you may run into the problem that the mirror source configured on the server hasn't caught up with the pinned requirements, so the install fails; in short, the version pinned in the txt can't be downloaded. The easiest fix is to check which versions the source actually has (pip usually lists them in the error message), then change the version number in requirements.txt to one you can download.
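As a made-up example (the package name and versions are for illustration only): if requirements.txt pins somepackage==3.5 but pip reports the mirror only goes up to 3.4.1, relax the pin and retry:

# illustration only: use the real package name and versions from the error message
sed -i 's/somepackage==3.5/somepackage==3.4.1/' requirements.txt
pip install -r requirements.txt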

Launching once everything is installed

python launch.py --share --listen --port 7860 --deepdanbooru 

Any port should work (some low-numbered ones may need sudo); 7860 is the default. --share is there because you can't expect to open a browser on a cloud server, so if you get an error mentioning localhost, just turn share on. If a server on a lab or campus intranet needs to be reachable from outside, I suppose a full tunnel is required. Copy the share URL it prints and paste it into your own browser. The deepdanbooru part is only needed for training; if you just want to generate images you can ignore it.
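If the gradio share link isn't an option (say, a lab server behind a firewall), an SSH tunnel from your own machine is a common alternative; you can then drop --share and --listen and open the page locally:

# run on your local machine; adjust user/host to your server
ssh -L 7860:localhost:7860 user@your-server
# then open http://localhost:7860 in your local browser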

More pitfalls

Environment configuration part

If there is "no matches found: httpx[socks]" in the error:

Note that the square brackets need to be escaped with single quotes; the pip install httpx[socks] suggested in the error hint won't work as typed, it has to be written like this:

pip install 'httpx[socks]'
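The "no matches found" message is zsh trying to expand the square brackets as a glob, so escaping the brackets works just as well as quoting:

pip install httpx\[socks\]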

Training part

This one is odd. I ran into a problem where an embedding could be created but could not be refreshed or loaded in the interface; I forget the exact error message. Anyway, following its hint, go to shared.py under stable-diffusion-webui/modules, find --disable-safe-unpickle, and set its default value to True.

parser.add_argument("--disable-safe-unpickle", action='store_true', help="disable checking pytorch models for malicious code", default=True)

If this turns out to be the fix, remember that wherever the .pt file produced by training is used later, the same change will have to be made there as well.
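Since the argument is declared with action='store_true', passing the flag at launch should have the same effect as editing the default, without touching shared.py at all:

python launch.py --share --listen --port 7860 --deepdanbooru --disable-safe-unpickle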

If training fails to start and the error message mentions cpu, use the fix from another author's post: go to stable-diffusion-webui/repositories/stable-diffusion/ldm/models/diffusion, open ddpm.py, and above line 1030 add

t = t.to('cpu')  # move t onto the CPU so the devices match

Finally: speaking only of fiddling around on a server, it may actually be more convenient to use the original scripts without the webui... The training parameters and the differences between the various methods are still being tested.

Original post: https://blog.csdn.net/A4paper/article/details/127817817