x-stable-diffusion
by Community
Real-time inference for Stable Diffusion - 0.88s latency. Covers AITemplate, nvFuser, TensorRT, FlashAttention. Join our Discord community: https://discord.com/invite/TgHXuSJEk6
OSS
Added 1 June 2026
Overview
x-stable-diffusion provides real-time inference for Stable Diffusion with a reported latency of 0.88 seconds. It leverages optimizations including AITemplate, nvFuser, TensorRT, and FlashAttention to accelerate model execution on compatible hardware.
Best for
Developers and researchers optimizing Stable Diffusion for low-latency inference on NVIDIA hardware.
Use cases
- Deploying Stable Diffusion for near-real-time image generation tasks
- Benchmarking inference performance across different optimization backends
- Experimenting with accelerated attention and template-based compilation
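The benchmarking use case above can be sketched as a small timing harness. This is a minimal illustration, not code from the x-stable-diffusion repository: `measure_latency`, `compare_backends`, and the two stand-in backends are hypothetical names, and each backend is assumed to be wrapped as a zero-argument callable that runs one full image generation.

```python
import statistics
import time
from typing import Callable, Dict, List


def measure_latency(run_once: Callable[[], object],
                    warmup: int = 2,
                    runs: int = 5) -> Dict[str, float]:
    """Time a zero-argument callable and return latency statistics in seconds."""
    if runs < 1:
        raise ValueError("runs must be at least 1")
    # Warm-up calls absorb one-time costs (compilation, cache fills); not timed.
    for _ in range(warmup):
        run_once()
    samples: List[float] = []
    for _ in range(runs):
        start = time.perf_counter()
        run_once()
        # For GPU work, the callable itself should block until the device is
        # finished (e.g. call torch.cuda.synchronize() inside it), otherwise
        # only the kernel launch time is measured.
        samples.append(time.perf_counter() - start)
    return {
        "mean_s": statistics.mean(samples),
        "median_s": statistics.median(samples),
        "min_s": min(samples),
        "max_s": max(samples),
    }


def compare_backends(backends: Dict[str, Callable[[], object]],
                     warmup: int = 2,
                     runs: int = 5) -> Dict[str, Dict[str, float]]:
    """Run measure_latency for each named backend and collect the results."""
    return {name: measure_latency(fn, warmup, runs)
            for name, fn in backends.items()}


if __name__ == "__main__":
    # Hypothetical stand-ins: replace each with a function that generates one
    # image through the backend under test.
    def fake_backend_a() -> None:
        time.sleep(0.02)

    def fake_backend_b() -> None:
        time.sleep(0.01)

    results = compare_backends({"backend_a": fake_backend_a,
                                "backend_b": fake_backend_b})
    for name, stats in results.items():
        print(f"{name}: median {stats['median_s']:.4f}s, "
              f"min {stats['min_s']:.4f}s, max {stats['max_s']:.4f}s")
```

Reporting the median alongside min and max helps separate steady-state latency from outliers; a single headline figure such as 0.88s is easier to compare when the measurement method is known.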
Notes
560 stars on GitHub. Last updated 2023-12-04. Licensed Apache-2.0.
Pros
- Reports very low inference latency (0.88s) through combined GPU optimizations
- Integrates multiple state-of-the-art optimization techniques in one repository
- Open source with an active Discord community for support and updates
Cons
- Primarily targets NVIDIA GPUs due to reliance on CUDA-based libraries
- Requires manual setup and configuration of each optimization backend
- Limited documentation beyond the README and community Discord
Indexed from awesome-llmops and enriched against its public facts.