ControlNet 1.1 is released, all 14 models are open source!

Source: https://github.com/lllyasviel/ControlNet-v1-1-nightly
ControlNet 1.1 has the exact same architecture as ControlNet 1.0. It includes all the previous models with improved robustness and result quality, and adds several new models.

Model naming convention update

Starting with ControlNet 1.1, all models are named using Standard ControlNet Naming Rules (SCNNRs), which the authors hope will improve the user experience (see the parsing sketch after the download link below).


ControlNet 1.1 includes 14 models (11 production-ready, 2 experimental, and 1 unfinished):

control_v11p_sd15_canny
control_v11p_sd15_mlsd
control_v11f1p_sd15_depth
control_v11p_sd15_normalbae
control_v11p_sd15_seg
control_v11p_sd15_inpaint
control_v11p_sd15_lineart
control_v11p_sd15s2_lineart_anime
control_v11p_sd15_openpose
control_v11p_sd15_scribble
control_v11p_sd15_softedge
control_v11e_sd15_shuffle
control_v11e_sd15_ip2p
control_v11u_sd15_tile

Model download address: https://huggingface.co/lllyasviel/ControlNet-v1-1/tree/main
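
As a rough illustration of the naming rules, here is a minimal sketch that splits a model filename into its fields. The field meanings (p/e/u for production-ready/experimental/unfinished, an optional f1 marking a bug-fixed variant) follow the upstream repository's description; treat them as assumptions of this sketch rather than something spelled out in this post.

    # Minimal sketch: parse a Standard ControlNet Naming Rules (SCNNR) filename.
    # Field meanings (p/e/u, f1) follow the upstream repo; treat as assumptions.
    import re

    SCNNR = re.compile(
        r"control_"
        r"(?P<version>v\d+)"        # e.g. v11 = ControlNet 1.1
        r"(?P<fix>f\d+)?"           # optional bug-fix marker, e.g. f1
        r"(?P<status>[peu])_"       # p=production, e=experimental, u=unfinished
        r"(?P<base>sd15(?:s2)?)_"   # base model, e.g. sd15
        r"(?P<method>\w+)"          # control method, e.g. canny
    )

    m = SCNNR.match("control_v11f1p_sd15_depth")
    print(m.groupdict())
    # {'version': 'v11', 'fix': 'f1', 'status': 'p', 'base': 'sd15', 'method': 'depth'}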

ControlNet 1.1 Depth

Model file: control_v11f1p_sd15_depth.pth
Config file: control_v11f1p_sd15_depth.yaml

Depth 1.1 improvements:

  1. There were several problems with the training dataset of ControlNet 1.0, including: (1) a small number of grayscale portrait images were duplicated thousands of times (!!), which made the previous model somewhat likely to generate grayscale portraits; (2) some images were low quality, very blurry, or had obvious JPEG artifacts; (3) a small number of images had wrong paired prompts due to errors in the data processing scripts. The new model fixes all the problems in the training dataset and should behave more reasonably in many cases.
  2. The new depth model is relatively unbiased. It is not trained to fit one kind of depth map produced by one specific depth estimation method, i.e., it does not overfit one preprocessor. This means the model works better with different depth estimators, different preprocessor resolutions, and even real depth maps created by 3D rendering engines.
  3. Some reasonable data augmentations are applied during training, such as random left-right flipping.
  4. The model is resumed from Depth 1.0, and it should work well in all cases where Depth 1.0 worked well. Depth 1.1 also works well in many cases where Depth 1.0 fails.
  5. If Midas depth ("Depth" in the webui plugin) is used at 384 preprocessor resolution, the difference between Depth 1.0 and 1.1 should be minimal. However, with other preprocessor resolutions or other preprocessors such as LeReS and ZoE, Depth 1.1 is expected to be somewhat better than 1.0 (see the usage sketch after this list).
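
For readers who want to try the model, here is a minimal usage sketch with the Hugging Face diffusers and controlnet_aux packages. The post itself does not name these libraries; the base-model checkpoint, prompt, and parameters below are illustrative assumptions.

    # Minimal sketch: condition Stable Diffusion 1.5 on a Midas depth map
    # using Depth 1.1. Library choice and parameters are assumptions.
    import torch
    from controlnet_aux import MidasDetector
    from diffusers import ControlNetModel, StableDiffusionControlNetPipeline
    from diffusers.utils import load_image

    midas = MidasDetector.from_pretrained("lllyasviel/Annotators")
    controlnet = ControlNetModel.from_pretrained(
        "lllyasviel/control_v11f1p_sd15_depth", torch_dtype=torch.float16
    )
    pipe = StableDiffusionControlNetPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", controlnet=controlnet,
        torch_dtype=torch.float16,
    ).to("cuda")

    image = load_image("input.png")   # hypothetical input photo
    depth = midas(image)              # Midas depth estimate ("Depth" in the webui plugin)
    out = pipe("a cozy room, best quality", image=depth,
               num_inference_steps=20).images[0]
    out.save("depth_result.png")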

ControlNet 1.1 Normal

Model file: control_v11p_sd15_normalbae.pth
Config file: control_v11p_sd15_normalbae.yaml



Improvements in Normal 1.1:

  1. The normal-from-midas method in Normal 1.0 is neither reasonable nor physically correct. It does not work well on many images, and the Normal 1.0 model cannot interpret real normal maps created by rendering engines.
  2. Normal 1.1 is much more reasonable, because its preprocessor is trained to estimate normal maps with a relatively correct protocol (NYU-V2's visualization method). This means Normal 1.1 can interpret real normal maps from rendering engines, as long as the colors are correct (blue facing the viewer, red pointing left, green pointing up); the sketch after this list illustrates the color convention.
  3. In tests, this model is reasonably robust and achieves performance similar to the depth model. In ControlNet 1.0, Normal 1.0 was not used very often, but Normal 1.1 is much improved and has the potential to be used far more frequently.
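
A hypothetical illustration of that color protocol: the snippet below builds a flat normal map in which every surface normal points at the viewer, which should come out as the uniform bluish tone described above. None of this code comes from the original post.

    # Illustrate the normal-map color protocol: map XYZ in [-1, 1] to RGB,
    # with +X (left) -> red, +Y (up) -> green, +Z (toward viewer) -> blue.
    import numpy as np
    from PIL import Image

    h, w = 512, 512
    normals = np.zeros((h, w, 3), dtype=np.float32)
    normals[..., 2] = 1.0  # every pixel faces the camera (+Z)

    rgb = ((normals * 0.5 + 0.5) * 255).astype(np.uint8)
    Image.fromarray(rgb).save("flat_normal.png")  # uniform bluish (127, 127, 255)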

ControlNet 1.1 Canny

Model file: control_v11p_sd15_canny.pth
Config file: control_v11p_sd15_canny.yaml


Canny 1.1 improvements:

  1. There were several problems with the training dataset of ControlNet 1.0, including: (1) a small number of grayscale portrait images were duplicated thousands of times (!!), which made the previous model somewhat likely to generate grayscale portraits; (2) some images were low quality, very blurry, or had obvious JPEG artifacts; (3) a small number of images had wrong paired prompts due to errors in our data processing scripts. The new model fixes all the problems in the training dataset and should behave more reasonably in many cases.
  2. Because the Canny model is one of the most important (and perhaps the most frequently used) ControlNet models, we spent funding to train it for 3 days on a machine with 8× Nvidia A100 80G at batch size 8×32=256, costing 72 × $30 = $2160 (the 8× A100 80G machine costs $30 per hour). The model is resumed from Canny 1.0.
  3. Some reasonable data augmentations are applied during training, such as random left-right flipping.
  4. Although evaluating a ControlNet is difficult, we find Canny 1.1 more robust and of visibly higher quality than Canny 1.0 (a sketch of the Canny preprocessing step follows this list).
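
For reference, this is roughly what the Canny preprocessing step looks like: the input photo is reduced to an edge map, which becomes the control image. Thresholds and file names here are illustrative, not values from the post.

    # Minimal sketch of Canny preprocessing: photo -> edge map -> control image.
    import cv2
    import numpy as np

    image = cv2.imread("input.png")
    edges = cv2.Canny(image, 100, 200)        # illustrative thresholds
    control = np.stack([edges] * 3, axis=-1)  # 3-channel map for ControlNet
    cv2.imwrite("canny_control.png", control)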

ControlNet 1.1 MLSD

Model file: control_v11p_sd15_mlsd.pth
Config file: control_v11p_sd15_mlsd.yaml


Improvements in MLSD 1.1:

  1. There were several problems with the training dataset of ControlNet 1.0, including: (1) a small number of grayscale portrait images were duplicated thousands of times (!!), which made the previous model somewhat likely to generate grayscale portraits; (2) some images were low quality, very blurry, or had obvious JPEG artifacts; (3) a small number of images had wrong paired prompts due to errors in our data processing scripts. The new model fixes all the problems in the training dataset and should behave more reasonably in many cases.
  2. The training dataset was enlarged by 300K images, found by using MLSD to select images containing more than 16 straight lines (a stand-in sketch of this filtering idea follows this list).
  3. Some reasonable data augmentations are applied during training, such as random left-right flipping.
  4. Resumed from MLSD 1.0, then continued training for 200 GPU hours on A100 80G.
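
The post does not show how the line-count filter worked; as a stand-in, the sketch below uses a Hough transform instead of MLSD to make the filtering idea concrete. All thresholds are made up for illustration.

    # Stand-in sketch of the dataset filter in item 2: keep images that
    # contain many straight lines. A Hough transform substitutes for MLSD.
    import cv2
    import numpy as np

    def has_many_lines(path: str, min_lines: int = 16) -> bool:
        gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
        edges = cv2.Canny(gray, 50, 150)
        lines = cv2.HoughLinesP(edges, 1, np.pi / 180, threshold=80,
                                minLineLength=40, maxLineGap=5)
        return lines is not None and len(lines) > min_lines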

ControlNet 1.1 Scribble

Model file: control_v11p_sd15_scribble.pth
Config file: control_v11p_sd15_scribble.yaml



Improvements in Scribble 1.1:

  1. There were several problems with the training dataset of ControlNet 1.0, including: (1) a small number of grayscale portrait images were duplicated thousands of times (!!), which made the previous model somewhat likely to generate grayscale portraits; (2) some images were low quality, very blurry, or had obvious JPEG artifacts; (3) a small number of images had wrong paired prompts due to errors in our data processing scripts. The new model fixes all the problems in the training dataset and should behave more reasonably in many cases.
  2. It was found that users sometimes like to draw very thick scribbles, so a more aggressive random morphological transform is used to synthesize scribbles (see the sketch after this list). The model should work well even for relatively thick scribbles: the training data goes up to 24-pixel-wide strokes on a 512 canvas, though it seems to work fine even for wider ones, and down to a minimum stroke width of 1 pixel.
  3. Resumed from Scribble 1.0, then continued training for 200 GPU hours on A100 80G.
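
One simple way to realize the morphological thickening in item 2 is random dilation; the sketch below is a guess at the general idea, not the actual training code.

    # Sketch of scribble thickening: randomly dilate a thin scribble so the
    # training data covers stroke widths from 1 to ~24 px on a 512 canvas.
    import random
    import cv2
    import numpy as np

    scribble = cv2.imread("thin_scribble.png", cv2.IMREAD_GRAYSCALE)  # white on black
    k = random.randint(1, 24)            # target stroke width in pixels
    kernel = np.ones((k, k), np.uint8)
    thick = cv2.dilate(scribble, kernel)
    cv2.imwrite("thick_scribble.png", thick)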

ControlNet 1.1 Soft Edge

Model file: control_v11p_sd15_softedge.pth
Config file: control_v11p_sd15_softedge.yaml

New in ControlNet 1.1: a new type of soft edge called "SoftEdge_safe" has been added. HED and PIDI tend to hide a corrupted grayscale version of the original image inside the soft estimation, and this hidden pattern can distract ControlNet and lead to bad results. The solution is to quantize the edge map to several levels during preprocessing, which removes the hidden pattern completely; a minimal sketch of this quantization follows.
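
The exact upstream quantization is not given in the post; this is a minimal sketch of the idea, with an illustrative number of levels.

    # Sketch of "safe" filtering: snap a soft edge map to a few gray levels
    # so any hidden grayscale copy of the source image is destroyed.
    import numpy as np

    def quantize_soft_edge(edge: np.ndarray, levels: int = 4) -> np.ndarray:
        """edge: float array in [0, 1]; returns values snapped to `levels` bins."""
        return np.round(edge * (levels - 1)) / (levels - 1)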


Improvements in Soft Edge 1.1:

  1. Soft Edge 1.1 was called HED 1.0 in the previous ControlNet.
  2. There were several problems with the training dataset of ControlNet 1.0, including: (1) a small number of grayscale portrait images were duplicated thousands of times (!!), which made the previous model somewhat likely to generate grayscale portraits; (2) some images were low quality, very blurry, or had obvious JPEG artifacts; (3) a small number of images had wrong paired prompts due to errors in our data processing scripts. The new model fixes all the problems in the training dataset and should behave more reasonably in many cases.
  3. Soft Edge 1.1 is significantly (almost 100% of the time) better than HED 1.0. This is mainly because the HED and PIDI estimators tend to hide a corrupted grayscale version of the original image inside the soft edge map, and the previous HED 1.0 model overfitted to restoring that hidden corrupted image instead of performing boundary-aware diffusion. Soft Edge 1.1 was trained with the "safe" filter applied to 75% of the data to remove such hidden corrupted grayscale images from the control maps. This makes Soft Edge 1.1 very powerful: in real-world tests it is about as usable as the depth model, and it is likely to be used more often.

ControlNet 1.1 Segmentation

Model file: control_v11p_sd15_seg.pth
Config file: control_v11p_sd15_seg.yaml

Improvements in Segmentation 1.1:

  1. Support for the COCO protocol. The previous Segmentation 1.0 supports about 150 colors (the ADE20K protocol), and Segmentation 1.1 additionally supports the 182 colors of the COCO protocol.
  2. Resumed from Segmentation 1.0. All previous inputs should still work.

ControlNet 1.1 Openpose

Model file: control_v11p_sd15_openpose.pth
Config file: control_v11p_sd15_openpose.yaml

Improvements in Openpose 1.1:

  1. The improvement of this model mainly comes from our improved implementation of OpenPose. We carefully reviewed the differences between the PyTorch OpenPose and CMU's C++ OpenPose, and the processor should now be more accurate, especially for hands. These processor improvements carry over into Openpose 1.1.
  2. Support for more inputs (hands and faces); see the sketch after this list.
  3. There were several problems with the training dataset of ControlNet 1.0, including: (1) a small number of grayscale portrait images were duplicated thousands of times (!!), which made the previous model somewhat likely to generate grayscale portraits; (2) some images were low quality, very blurry, or had obvious JPEG artifacts; (3) a small number of images had wrong paired prompts due to errors in our data processing scripts. The new model fixes all the problems in the training dataset and should behave more reasonably in many cases.
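
A minimal sketch of extracting a pose map with hands and faces, assuming the controlnet_aux package (the post does not name a library, and the parameter names below are assumptions):

    # Sketch: full-body pose with hands and faces via controlnet_aux.
    from controlnet_aux import OpenposeDetector
    from diffusers.utils import load_image

    openpose = OpenposeDetector.from_pretrained("lllyasviel/Annotators")
    image = load_image("person.png")  # hypothetical input
    pose = openpose(image, include_hand=True, include_face=True)
    pose.save("pose_control.png")     # feed this to the Openpose model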

ControlNet 1.1 Lineart

Model file: control_v11p_sd15_lineart.pth
Config file: control_v11p_sd15_lineart.yaml

ControlNet 1.1 Anime Lineart

Model file: control_v11p_sd15s2_lineart_anime.pth
Config file: control_v11p_sd15s2_lineart_anime.yaml

ControlNet 1.1 Shuffle

Model file: control_v11e_sd15_shuffle.pth
Config file: control_v11e_sd15_shuffle.yaml

ControlNet 1.1 Instruct Pix2Pix

Model file: control_v11e_sd15_ip2p.pth
Config file: control_v11e_sd15_ip2p.yaml

ControlNet 1.1 Inpaint

Model file: control_v11p_sd15_inpaint.pth
Config file: control_v11p_sd15_inpaint.yaml

ControlNet 1.1 Tile (Unfinished)


Origin: https://blog.csdn.net/yanqianglifei/article/details/130175604