Enterprise DNA

Open source alternatives to vLLM

13 open-source alternatives to vLLM in the index, ranked by GitHub stars and freshness.

O OSS Framework medium

ollama

Community

Get up and running with Kimi-K2.5, GLM-5, MiniMax, DeepSeek, gpt-oss, Qwen, Gemma and other models.

★ 172,846 updated 2d ago
open-source

Best for: Developers building local-first applications or prototyping with open-source LLMs without cloud costs.

O OSS Framework medium

llama.cpp

Community

LLM inference in C/C++

★ 114,160 updated 2d ago
open-source

Best for: Developers building privacy-first or offline-capable applications on constrained hardware.

O OSS Framework medium

SGLang

Community

SGLang is a high-performance serving framework for large language models and multimodal models.

★ 28,885 updated 2d ago
open-source

Best for: Teams building production LLM services who need performance-optimized serving infrastructure.

O OSS Framework medium

TensorRT-LLM

Community

TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NV

★ 13,781 updated 2d ago
open-source

Best for: Teams deploying LLMs at scale on NVIDIA infrastructure who need maximum inference performance.

O OSS Obs medium

text-generation-inference

Community

Large Language Model Text Generation Inference

★ 10,857 updated 2mo ago
open-source

Best for: Developers needing a production-grade, self-hosted LLM serving solution.

O OSS Obs medium

Triton Server (TRTIS)

Community

The Triton Inference Server provides an optimized cloud and edge inferencing solution.

★ 10,720 updated 2d ago
open-source

Best for: Teams deploying large-scale inference services that need high throughput and multi-framework support.

O OSS Obs medium

FlexGen

Community

Running large language models on a single GPU for throughput-oriented scenarios.

★ 9,365 updated 1y ago
open-source

Best for: Developers who need to run large language models at high throughput on a single GPU, especially in budget-constrained or research environments.

O OSS Framework medium

LMDeploy

Community

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.

★ 7,876 updated 2d ago
open-source

Best for: Developers who need to compress and serve LLMs efficiently in production.

O OSS Framework medium

mistral.rs

Community

Fast, flexible LLM inference

★ 7,205 updated 2d ago
open-source

Best for: Rust developers seeking a fast, flexible LLM inference framework for performance-critical or resource-constrained environments.

O OSS Framework medium

FasterTransformer

Community

Transformer-related optimization, including BERT and GPT

★ 6,418 updated 2y ago
open-source

Best for: Developers seeking maximum inference performance for transformer models on NVIDIA hardware.

O OSS Obs medium

ray-llm

Community

RayLLM - LLMs on Ray (Archived). Read README for more info.

★ 1,267 updated 1y ago
open-source

Best for: Developers already using Ray who need legacy code or patterns for running LLMs at scale.

O OSS Framework medium

IntelliServer

Community

AI models as scalable microservices, enabling evaluation of LLMs and offering end-to-end functions such as chatbot, semantic search, image generation and beyond.

★ 29 updated 1y ago
open-source

Best for: JavaScript developers who need a simple microservice wrapper for deploying and evaluating AI models.

O OSS Framework medium

TGI

Community

We’re on a journey to advance and democratize artificial intelligence through open source and open science.

open-source

Best for: Developers and teams who need to self-host or fine-tune open-source LLMs at scale.