Open source alternatives to vLLM

9 open-source alternatives to vLLM in the index, ranked by GitHub stars and freshness.

O OSS Framework medium

SGLang

Community

SGLang is a high-performance serving framework for large language models and multimodal models.

★ 28,885 updated 1mo ago

open-source

Best for: Teams building production LLM services who need performance-optimized serving infrastructure.

O OSS Framework medium

TensorRT-LLM

Community

TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NV

★ 13,781 updated 1mo ago

open-source

Best for: Teams deploying LLMs at scale on NVIDIA infrastructure who need maximum inference performance.

O OSS Obs medium

text-generation-inference

Community

Large Language Model Text Generation Inference

★ 10,857 updated 3mo ago

open-source

Best for: Developers needing a production-grade, self-hosted LLM serving solution.

O OSS Framework medium

LMDeploy

Community

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.

★ 7,876 updated 1mo ago

open-source

Best for: Developers who need to compress and serve LLMs efficiently in production.

O OSS Framework medium

mistral.rs

Community

Fast, flexible LLM inference

★ 7,205 updated 1mo ago

open-source

Best for: Rust developers seeking a fast, flexible LLM inference framework for performance-critical or resource-constrained environments.

O OSS Framework medium

FasterTransformer

Community

Transformer-related optimization, including BERT and GPT

★ 6,418 updated 2y ago

open-source

Best for: Developers seeking maximum inference performance for transformer models on NVIDIA hardware.

O OSS Obs medium

Shimmy

Community

Python-free Rust inference server with an OpenAI-compatible API. Supports GGUF and SafeTensors, hot model swap, and auto-discovery in a single binary. Free now, free forever.

★ 5,306 updated 1mo ago

open-source

Best for: Developers seeking a free, no-fuss Rust-based inference server with OpenAI API compatibility.

O OSS Framework medium

Text-Embeddings-Inference

Community

A blazing fast inference solution for text embeddings models

★ 4,829 updated 1mo ago

open-source

Best for: Developers who need fast, scalable embedding serving for search or NLP pipelines.

O OSS Framework medium

TGI

Community

We’re on a journey to advance and democratize artificial intelligence through open source and open science.

open-source

Best for: Developers and teams who need to self-host or fine-tune open-source LLMs at scale.