【Stable Diffusion】What are FID, CLIP, and cfg-scales?

The stable-diffusion repository describes how the model is evaluated as follows:

Evaluations with different classifier-free guidance scales (1.5, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0) and 50 PLMS sampling steps show the relative improvements of the checkpoints.

This corresponds to the following figure:
[Figure: FID and CLIP scores of the checkpoints at different cfg-scales]

What do the FID score, CLIP score, and cfg-scales in the figure mean?

FID score

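The FID (Fréchet Inception Distance) score is a metric for evaluating the quality of images produced by a generative model. It compares two image sets p and q (typically real and generated images) through the statistics of their Inception-network features, where μ is the feature mean and C the feature covariance of each set. A lower FID means the two distributions are closer. The formula is: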

$$\mathrm{FID}(p, q) = \|\mu_p - \mu_q\|_2^2 + \mathrm{Tr}\left(C_p + C_q - 2\sqrt{C_p C_q}\right)$$
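To make the formula concrete, here is a minimal NumPy sketch that evaluates it on two feature matrices. It is an illustration under assumptions, not the evaluation code used by the stable-diffusion repository: the function name `frechet_distance` is hypothetical, and random vectors stand in for the Inception features that a real FID computation would extract from images. The trace of the matrix square root is obtained from the eigenvalues of C_p C_q, which avoids an explicit matrix square root.

```python
import numpy as np


def frechet_distance(feats_p: np.ndarray, feats_q: np.ndarray) -> float:
    """Frechet distance between Gaussians fitted to two feature sets.

    Hypothetical helper: each input has shape (n_samples, n_features).
    In a real FID computation the rows would be Inception features.
    """
    # Mean and covariance of each feature set (mu_p, C_p, mu_q, C_q).
    mu_p, mu_q = feats_p.mean(axis=0), feats_q.mean(axis=0)
    c_p = np.atleast_2d(np.cov(feats_p, rowvar=False))
    c_q = np.atleast_2d(np.cov(feats_q, rowvar=False))

    # Tr(sqrt(C_p C_q)) equals the sum of the square roots of the
    # eigenvalues of C_p C_q; tiny negative values from rounding are clipped.
    eigvals = np.linalg.eigvals(c_p @ c_q)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()

    mean_term = np.sum((mu_p - mu_q) ** 2)
    return float(mean_term + np.trace(c_p) + np.trace(c_q) - 2.0 * tr_sqrt)


# Stand-in data (assumption): random vectors instead of image features.
rng = np.random.default_rng(0)
a = rng.normal(size=(500, 8))
b = rng.normal(loc=0.5, size=(500, 8))

print(frechet_distance(a, a))  # identical sets: distance is close to zero
print(frechet_distance(a, b))  # shifted mean: distance is clearly larger
```

Comparing a set with itself gives a value near zero, while shifting the mean of the second set increases the distance, which matches the reading that a lower FID means the generated images are statistically closer to the real ones.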


Origin blog.csdn.net/luxu1220/article/details/130624336