x-stable-diffusion

by Community

Real-time inference for Stable Diffusion - 0.88s latency. Covers AITemplate, nvFuser, TensorRT, FlashAttention. Join our Discord community: https://discord.com/invite/TgHXuSJEk6

Added 1 June 2026

#aitemplate #automl #cuda #docker #inference #notebook #nvfuser #onnx

Overview

x-stable-diffusion provides real-time inference for Stable Diffusion with a reported latency of 0.88 seconds. It leverages optimizations including AITemplate, nvFuser, TensorRT, and FlashAttention to accelerate model execution on compatible hardware.

Best for
Developers and researchers optimizing Stable Diffusion for low-latency inference on NVIDIA hardware.

Use cases

Deploying Stable Diffusion for near-real-time image generation tasks
Benchmarking inference performance across different optimization backends
Experimenting with accelerated attention and template-based compilation
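
For the benchmarking use case, a minimal timing harness might look like the sketch below. It is not taken from the x-stable-diffusion repository: `benchmark` and `fake_backend` are hypothetical names, and the stand-in backends only sleep, so a real comparison would replace them with calls into whichever pipeline is being measured.

```python
import statistics
import time
from typing import Callable, Dict, List


def benchmark(fn: Callable[[], object], warmup: int = 2, runs: int = 5) -> Dict[str, float]:
    """Time a zero-argument callable and return latency statistics in seconds."""
    for _ in range(warmup):
        fn()  # warm-up calls absorb one-time costs such as compilation or caching
    samples: List[float] = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return {
        "mean_s": statistics.mean(samples),
        "median_s": statistics.median(samples),
        "min_s": min(samples),
        "max_s": max(samples),
    }


def fake_backend(delay_s: float) -> Callable[[], None]:
    """Hypothetical stand-in for a real pipeline call; it only sleeps."""
    def run() -> None:
        time.sleep(delay_s)
    return run


if __name__ == "__main__":
    # Assumed backend names, for illustration only.
    backends = {"baseline": fake_backend(0.02), "optimized": fake_backend(0.01)}
    for name, fn in backends.items():
        stats = benchmark(fn)
        print(f"{name}: median {stats['median_s']:.4f}s, min {stats['min_s']:.4f}s")
```

When timing real GPU work from PyTorch, the callable should wait for the device to finish (for example by calling `torch.cuda.synchronize()`) before returning; otherwise the timer may capture only the kernel launch rather than the full inference.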

Notes

560 stars on GitHub. Last updated 2023-12-04. Licensed Apache-2.0.

Pros

Achieves very low inference latency (0.88s) through combined GPU optimizations
Integrates multiple state-of-the-art optimization techniques in one repository
Open source with an active Discord community for support and updates

Cons

Primarily targets NVIDIA GPUs due to reliance on CUDA-based libraries
Requires manual setup and configuration of each optimization backend
Limited documentation beyond the README and community Discord

Indexed from awesome-llmops and enriched against its public facts.

Pairs with

Other entries in the index that connect to this one.

Uses (1 entry)

stable-diffusion

Community

A latent text-to-image diffusion model

★ 73,065 updated 2y ago

Built with (1 entry)

PyTorch

Community

Tensors and Dynamic neural networks in Python with strong GPU acceleration

★ 100,318 updated 1mo ago
