Even_Adder@lemmy.dbzer0.com 3 points 2 days ago

It's basically when you use a larger model to train a smaller one. You train the student model on outputs generated by the teacher model alongside ground-truth data, and by some strange alchemy I don't quite understand you get a much smaller model that behaves a lot like the teacher.
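For a rough idea, here's a minimal sketch of the classic classifier-style recipe in PyTorch (diffusion-model distillation, like step distillation, is more involved than this). The `student`, `teacher`, `x`, `y`, and the `alpha`/`T` hyperparameters are all hypothetical stand-ins, not anything from a real codebase:

```python
import torch
import torch.nn.functional as F

def distill_step(student, teacher, x, y, optimizer, alpha=0.5, T=2.0):
    """One training step mixing ground-truth loss with soft teacher targets."""
    teacher.eval()
    with torch.no_grad():
        teacher_logits = teacher(x)  # soft targets from the big model

    student_logits = student(x)

    # Standard supervised loss against the ground-truth labels
    hard_loss = F.cross_entropy(student_logits, y)

    # KL divergence between temperature-softened student and teacher outputs
    soft_loss = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)

    # Blend the two; alpha trades off copying the teacher vs. the labels
    loss = alpha * hard_loss + (1 - alpha) * soft_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

The soft targets carry more information per example than the hard labels (how confident the teacher is, which wrong answers it considers plausible), which is presumably where the "alchemy" comes from.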

It's really hard to train on top of a distilled model without breaking it, so people prefer undistilled models whenever possible. Without the teacher model, distilled models are basically crippleware.

Thanks for explaining!