Introduction

Stable Diffusion is a latent text-to-image diffusion model capable of generating photo-realistic images given any text input.
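To make "latent" concrete: the diffusion process runs on a compressed representation of the image rather than on the pixels. Below is a minimal sketch of the arithmetic, assuming the configuration commonly used with V1 (an autoencoder that downsamples each side by a factor of 8 and produces 4 latent channels). The function names `latent_shape` and `compression_ratio` are hypothetical helpers for illustration, not part of any library.

```python
def latent_shape(height, width, downsample=8, latent_channels=4):
    """Return (channels, height, width) of the latent for an RGB image.

    Assumption: downsample=8 and latent_channels=4, the values commonly
    used with Stable Diffusion V1. Other checkpoints may differ.
    """
    if height % downsample or width % downsample:
        raise ValueError("image sides must be divisible by the downsampling factor")
    return (latent_channels, height // downsample, width // downsample)


def compression_ratio(height, width, downsample=8, latent_channels=4):
    """Ratio of pixel-space values (3 RGB channels) to latent-space values."""
    c, h, w = latent_shape(height, width, downsample, latent_channels)
    return (3 * height * width) / (c * h * w)


print(latent_shape(512, 512))       # (4, 64, 64)
print(compression_ratio(512, 512))  # 48.0
```

Under these assumptions, a 512×512 image is denoised as a 4×64×64 tensor, roughly 48 times fewer values than the pixel grid, which is a large part of why the model can run on consumer hardware.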

V1

V2

Running locally on Mac

Running on the cloud

How does it work?

What is it currently good or bad at?

  • In-painting a region of the picture with a custom color or image does not work well with either V1 or V2;

  • It does quite a good job with poses and single-person generation when paired with ControlNet.

Some really interesting applications of the model

Animation with Stable Diffusion and Unreal Engine 5

Reactive audio based on the generated image

Image-to-image + voice and video synth

Text-to-video editing

Morphing Stable Diffusion images with Unreal Engine 5 models

VFX “makeup”

Stable Diffusion animation