Cross Attention Mechanism
Daniel explains the three main components of Stable Diffusion: the text encoder, the autoencoder, and the diffusion model. Cross-attention combines the text representation with random noise, guiding the diffusion model to produce images that are semantically relevant to the input text.
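To make the cross-attention step concrete, here is a minimal NumPy sketch of a single attention head, assuming the noisy image latents supply the queries and the text encoder output supplies the keys and values. The function name, the shapes, and the random projection matrices are illustrative assumptions, not the actual Stable Diffusion code, which uses learned weights and several heads inside a U-Net.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row maximum before exponentiating for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(latents, text_emb, w_q, w_k, w_v):
    """Single-head cross-attention (hypothetical helper for illustration).

    latents:  (n_latent, d_latent) noisy image features, used as queries
    text_emb: (n_tokens, d_text)   text encoder output, used as keys and values
    """
    q = latents @ w_q    # (n_latent, d_attn)
    k = text_emb @ w_k   # (n_tokens, d_attn)
    v = text_emb @ w_v   # (n_tokens, d_attn)
    # Scaled dot-product scores between every latent position and every token.
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = softmax(scores, axis=-1)   # each row sums to 1 over the tokens
    return weights @ v, weights

# Toy sizes chosen for the example only.
rng = np.random.default_rng(0)
n_latent, n_tokens = 16, 8
d_latent, d_text, d_attn = 32, 48, 24

latents = rng.standard_normal((n_latent, d_latent))   # stands in for random noise
text_emb = rng.standard_normal((n_tokens, d_text))    # stands in for encoded text

# Random projections stand in for learned weights.
w_q = rng.standard_normal((d_latent, d_attn)) * 0.1
w_k = rng.standard_normal((d_text, d_attn)) * 0.1
w_v = rng.standard_normal((d_text, d_attn)) * 0.1

out, weights = cross_attention(latents, text_emb, w_q, w_k, w_v)
print(out.shape, weights.shape)
```

Each row of `weights` describes how strongly one latent position attends to each text token, which is the sense in which the text guides what the model produces at that position.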
From this podcast

Practical AI
Stable Diffusion