O Open Source Frameworks medium

GPUStack

by Community

A GPU cluster manager that configures and orchestrates inference engines like vLLM and SGLang for high-performance AI model deployment.

Added 1 June 2026

#ascend #cuda #deepseek #distributed-inference #genai #high-performance-inference #inference #llama

Overview

GPUStack is an open-source GPU cluster manager that configures and orchestrates inference engines such as vLLM and SGLang. It handles resource allocation and scheduling across multiple GPUs to enable high-performance deployment of AI models.

Best for
Teams needing to manage and scale AI model inference across multiple GPUs

Use cases

Deploying large language models across a multi-GPU cluster
Managing inference engine configurations for vLLM or SGLang
Scaling model serving with automatic GPU resource scheduling
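
As a rough illustration of what automatic GPU resource scheduling involves, the sketch below greedily places each model on the GPU that currently has the most free memory. It is not GPUStack's actual scheduler: the Gpu class, the place function, the memory figures, and the placement policy are all hypothetical and chosen only for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Gpu:
    """Hypothetical GPU record for this sketch; not a GPUStack type."""
    name: str
    free_gib: float
    models: List[str] = field(default_factory=list)


def place(gpus: List[Gpu], requests: Dict[str, float]) -> Dict[str, Optional[str]]:
    """Greedy placement: largest models first, each onto the GPU with the
    most free memory that can still hold it.

    Returns model name -> GPU name, or None when no single GPU has room.
    """
    placement: Dict[str, Optional[str]] = {}
    # Place big models first so small ones do not fragment the free memory.
    ordered = sorted(requests.items(), key=lambda kv: kv[1], reverse=True)
    for model, need_gib in ordered:
        candidates = [g for g in gpus if g.free_gib >= need_gib]
        if not candidates:
            placement[model] = None  # assumed behaviour: leave it unscheduled
            continue
        target = max(candidates, key=lambda g: g.free_gib)
        target.free_gib -= need_gib
        target.models.append(model)
        placement[model] = target.name
    return placement


if __name__ == "__main__":
    # Illustrative numbers only.
    cluster = [Gpu("gpu-0", 80.0), Gpu("gpu-1", 40.0)]
    wanted = {"model-a": 60.0, "model-b": 30.0, "model-c": 30.0}
    print(place(cluster, wanted))
```

A real cluster manager also has to account for things this sketch ignores, such as splitting one model across several GPUs and per-engine configuration.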

Notes

5,082 stars on GitHub. Last updated 2026-06-01. Licensed Apache-2.0.

Pros

Open-source with active community support
Supports popular inference engines out of the box
Simplifies cluster management for GPU workloads

Cons

Requires familiarity with GPU cluster administration
Covers inference engine orchestration only, not model training
Documentation may be less comprehensive than commercial alternatives

Indexed from awesome-llm and enriched against its public facts.

Pairs with

Other entries in the index that connect to this one.

Uses (2 entries)

O OSS Framework medium

vLLM

Community

A high-throughput and memory-efficient inference and serving engine for LLMs

★ 81,619 updated 1mo ago

O OSS Framework medium

SGLang

Community

SGLang is a high-performance serving framework for large language models and multimodal models.

★ 28,885 updated 1mo ago
