Stable Diffusion Turbo – Comfy UI Workflows



Stability AI recently released two new models, SDXL Turbo and SD Turbo. Both are designed to drastically reduce the number of sampling steps required, to as low as one, which allows for rapid testing and real-time applications. SD Turbo is trained on SD 2.1, which is a bit of a poor choice in my opinion given that the vast majority of users and content are on SD 1.5 and SDXL; most people, myself included, skipped SD 2.1 entirely. In this article I’m going to cover a number of ways you can make use of Turbo. You may also have heard of another method for reducing the number of steps, known as LCM (Latent Consistency Models); in comparison, Turbo is essentially always better and maintains higher quality.

As a caveat, the quality of these turbo models is lower than that of regular models. Both are trained to output 512×512 and can be pushed to about 640×640 at most. There are ways around this, which I will cover later in this article, but you should consider turbo more of a very useful tool than an option for directly producing high quality images.

Ultra Fast

For the fastest possible generation you’re going to need the following workflow:

Naturally, this workflow sacrifices quality for speed, allowing one-step generation. It works great for prompt testing and real-time drawing with control nets (example at end). To maximize speed, use the taesd or taesdxl VAE decoder (place it in models/vae_approx). You may also want to launch Comfy UI with --gpu-only to avoid any model unloading. Steps should not exceed 4 and CFG should not go above 1.5; because the CFG is so low, the negative prompt is practically useless.
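
As a rough illustration of these settings, here is a minimal sketch of a one-step graph in Comfy UI's API (JSON) format, built as a Python dictionary. The node class names and input names are those of the stock Comfy UI nodes as far as I know them. The checkpoint filename, the sampler and scheduler choice, the example prompt and the helper name `build_turbo_prompt` are assumptions made for illustration, so treat this as a starting point rather than the exact workflow from this article.

```python
import json

MAX_STEPS = 4    # upper bound for turbo steps suggested in this article
MAX_CFG = 1.5    # upper bound for turbo CFG suggested in this article


def build_turbo_prompt(text, steps=1, cfg=1.0, seed=0,
                       ckpt_name="sd_xl_turbo_1.0_fp16.safetensors"):
    """Return an API-format graph for a very fast turbo generation.

    ckpt_name is an assumed filename; use whichever turbo checkpoint
    sits in your models/checkpoints folder.
    """
    if not 1 <= steps <= MAX_STEPS:
        raise ValueError(f"steps should be between 1 and {MAX_STEPS}")
    if not 0 < cfg <= MAX_CFG:
        raise ValueError(f"cfg should be above 0 and at most {MAX_CFG}")
    return {
        # Outputs of the checkpoint loader: 0 = model, 1 = CLIP, 2 = VAE.
        "1": {"class_type": "CheckpointLoaderSimple",
              "inputs": {"ckpt_name": ckpt_name}},
        "2": {"class_type": "CLIPTextEncode",
              "inputs": {"text": text, "clip": ["1", 1]}},
        # Negative prompt left empty: it has little effect at such low CFG.
        "3": {"class_type": "CLIPTextEncode",
              "inputs": {"text": "", "clip": ["1", 1]}},
        "4": {"class_type": "EmptyLatentImage",
              "inputs": {"width": 512, "height": 512, "batch_size": 1}},
        "5": {"class_type": "KSampler",
              "inputs": {"model": ["1", 0], "positive": ["2", 0],
                         "negative": ["3", 0], "latent_image": ["4", 0],
                         "seed": seed, "steps": steps, "cfg": cfg,
                         "sampler_name": "euler_ancestral",  # assumed choice
                         "scheduler": "normal", "denoise": 1.0}},
        "6": {"class_type": "VAEDecode",
              "inputs": {"samples": ["5", 0], "vae": ["1", 2]}},
        "7": {"class_type": "SaveImage",
              "inputs": {"images": ["6", 0], "filename_prefix": "turbo"}},
    }


if __name__ == "__main__":
    print(json.dumps(build_turbo_prompt("a lighthouse at dusk"), indent=2))
```

The resulting dictionary can be queued on a running Comfy UI instance by sending it as the `prompt` field of a JSON POST to the `/prompt` endpoint. The helper rejects step and CFG values outside the ranges given above, which is handy when settings are driven by a script.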

Model Merge and Turbo LoRA

One thing that may be a deal breaker for you is being stuck with the rather limited base SD training. Fortunately, you can merge turbo with other models and still get most of the benefit. A number of merged models are already available, ready to use, on popular model sharing websites such as Civitai. You can also do this yourself with the Model Merge Simple and Model Merge Blocks nodes, though you may need to play around with the ratio a bit. With a merged model you should use a regular sampler and aim for 4-12 steps with CFG up to 2.5. You can also push the resolution further this way, with 1024×768 being possible with an SDXL merge. However, turbo does still have somewhat of an influence on the generated images.
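
To give a feel for what the merge ratio does, here is a minimal sketch of a simple linear merge on toy weights. It only illustrates the general idea of a weighted average of two models and is not Comfy UI's actual implementation; the function name `merge_simple`, the toy data, and the convention that the ratio is the share of the first model are all assumptions, so check which input the ratio favours in the node itself.

```python
def merge_simple(weights_a, weights_b, ratio):
    """Blend two models parameter by parameter: ratio * A + (1 - ratio) * B.

    Each model is represented here as a dict mapping a parameter name
    to a flat list of floats (a toy stand-in for real tensors).
    """
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be between 0 and 1")
    if weights_a.keys() != weights_b.keys():
        raise ValueError("both models need the same parameter names")
    merged = {}
    for name, a in weights_a.items():
        b = weights_b[name]
        merged[name] = [ratio * x + (1.0 - ratio) * y for x, y in zip(a, b)]
    return merged


if __name__ == "__main__":
    custom = {"layer.weight": [0.0, 1.0]}   # toy "fine-tuned" model
    turbo = {"layer.weight": [1.0, 0.0]}    # toy "turbo" model
    # 75% custom model, 25% turbo
    print(merge_simple(custom, turbo, 0.75))  # {'layer.weight': [0.25, 0.75]}
```

A ratio closer to the custom model keeps more of its style but loses some of the low-step behaviour, which is why the ratio usually needs a bit of experimentation.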

To remove the need to merge the models, and to reduce the influence of turbo, there is a recently released Turbo LoRA by shiroppo which works very well. Use it with the model-only LoRA loader at a weight of 0.25 to 0.5.
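
To show what the LoRA weight controls, here is a minimal sketch of how a LoRA is generally applied to a single weight matrix: the low-rank product of its two small matrices is scaled by the strength and added to the original weights. This is the standard LoRA formula on toy data, not code taken from Comfy UI or from the Turbo LoRA itself, and the function names are hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def apply_lora(weight, up, down, strength, alpha=None):
    """Return weight + strength * scale * (up @ down).

    `down` has shape (rank, in) and `up` has shape (out, rank).
    If the LoRA stores an alpha value, scale is alpha / rank, else 1.
    """
    rank = len(down)
    scale = alpha / rank if alpha is not None else 1.0
    delta = matmul(up, down)
    return [[w + strength * scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(weight, delta)]


if __name__ == "__main__":
    weight = [[1.0, 0.0], [0.0, 1.0]]   # toy base weights
    up = [[1.0], [2.0]]                 # toy rank-1 LoRA matrices
    down = [[1.0, 1.0]]
    print(apply_lora(weight, up, down, strength=0.5))
    # [[1.5, 0.5], [1.0, 2.0]]
```

At a strength of 0 the base weights are unchanged, and at 1 the full LoRA delta is applied, so a value between 0.25 and 0.5 adds only part of the turbo behaviour to the base model.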

Quality can be improved further by using FreeU_V2 (already integrated in Comfy UI). It also significantly improves contrast and colour, which is desirable given the more limited CFG range. The end result is that you can get reasonably good quality images in as few as 6 steps, which is a massive time saving, particularly if you’re using a lower end GPU. You can enhance these images further with a regular image-to-image workflow, which makes this approach very efficient.


Here are a few samples of what can be achieved. These were all made with Juggernaut V7 + Turbo LoRA 0.4 + FreeU_V2, clip skip 1, 12 steps, CFG 1.2, 768×1024.
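
For reference, the settings listed above could be wired up roughly as follows in Comfy UI's API (JSON) format, written here as a Python dictionary. This is a sketch under assumptions: both filenames are hypothetical placeholders, the FreeU_V2 values (b1, b2, s1, s2) are not stated in this article and are only illustrative, and the sampler and scheduler are my own pick of a "regular" sampler. Clip skip 1 is the default behaviour, so no extra node is added for it.

```python
def build_lora_freeu_prompt(text, seed=0,
                            ckpt_name="juggernautXL_v7.safetensors",
                            lora_name="sdxl_turbo_lora.safetensors"):
    """Checkpoint -> Turbo LoRA (model only) -> FreeU_V2 -> KSampler.

    ckpt_name and lora_name are hypothetical filenames; replace them
    with the files in your own models folders.
    """
    return {
        # Outputs of the checkpoint loader: 0 = model, 1 = CLIP, 2 = VAE.
        "1": {"class_type": "CheckpointLoaderSimple",
              "inputs": {"ckpt_name": ckpt_name}},
        "2": {"class_type": "LoraLoaderModelOnly",
              "inputs": {"model": ["1", 0], "lora_name": lora_name,
                         "strength_model": 0.4}},
        # FreeU values are assumed, not taken from the article.
        "3": {"class_type": "FreeU_V2",
              "inputs": {"model": ["2", 0],
                         "b1": 1.3, "b2": 1.4, "s1": 0.9, "s2": 0.2}},
        "4": {"class_type": "CLIPTextEncode",
              "inputs": {"text": text, "clip": ["1", 1]}},
        "5": {"class_type": "CLIPTextEncode",
              "inputs": {"text": "", "clip": ["1", 1]}},
        "6": {"class_type": "EmptyLatentImage",
              "inputs": {"width": 768, "height": 1024, "batch_size": 1}},
        "7": {"class_type": "KSampler",
              "inputs": {"model": ["3", 0], "positive": ["4", 0],
                         "negative": ["5", 0], "latent_image": ["6", 0],
                         "seed": seed, "steps": 12, "cfg": 1.2,
                         "sampler_name": "euler_ancestral",  # assumed choice
                         "scheduler": "normal", "denoise": 1.0}},
        "8": {"class_type": "VAEDecode",
              "inputs": {"samples": ["7", 0], "vae": ["1", 2]}},
        "9": {"class_type": "SaveImage",
              "inputs": {"images": ["8", 0],
                         "filename_prefix": "turbo_lora"}},
    }


if __name__ == "__main__":
    import json
    print(json.dumps(build_lora_freeu_prompt("portrait photo"), indent=2))
```

The key point is the order of the model chain: the LoRA is applied to the checkpoint's model output first, FreeU_V2 patches the result, and only then does the sampler receive the model.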