OpenModelZ
by Community
Autoscale LLM inference (vLLM, SGLang, LMDeploy) on Kubernetes and other platforms
OSS
Added 1 June 2026
Overview
OpenModelZ is an open-source tool that autoscales LLM inference deployments (vLLM, SGLang, LMDeploy) on Kubernetes and other platforms. It monitors inference workloads and adjusts resources to maintain performance while minimizing over-provisioning.
Best for
Teams operating LLM inference services on Kubernetes who need workload-driven autoscaling.
Use cases
- Automatically scale vLLM deployments based on request load
- Optimize GPU utilization for SGLang inference servers
- Manage dynamic inference capacity for LMDeploy on Kubernetes
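To make "scaling based on request load" concrete, here is a minimal Go sketch of a proportional scaling rule: divide the total in-flight requests by a per-replica target, round up, and clamp to configured bounds. This is an illustration only, not OpenModelZ's actual algorithm or API, which this entry does not describe. The function name `desiredReplicas`, its parameters, and the numbers in `main` are all hypothetical.

```go
package main

import (
	"fmt"
	"math"
)

// desiredReplicas is a hypothetical helper, not part of OpenModelZ.
// It returns ceil(totalInflight / targetPerReplica), clamped to
// the range [minReplicas, maxReplicas].
func desiredReplicas(totalInflight, targetPerReplica float64, minReplicas, maxReplicas int) int {
	// An invalid target gives no basis for scaling; fall back to the minimum.
	if targetPerReplica <= 0 {
		return minReplicas
	}

	// Round up so that no replica is asked to exceed its target load.
	n := int(math.Ceil(totalInflight / targetPerReplica))

	// Clamp to the configured bounds.
	if n < minReplicas {
		return minReplicas
	}
	if n > maxReplicas {
		return maxReplicas
	}
	return n
}

func main() {
	// Assumed example values: a target of 10 in-flight requests per replica,
	// with between 1 and 8 replicas allowed.
	for _, load := range []float64{0, 35, 1000} {
		fmt.Printf("in-flight=%v -> replicas=%d\n", load, desiredReplicas(load, 10, 1, 8))
	}
}
```

A real autoscaler would typically also smooth the metric over a time window and add a cooldown before scaling down, since GPU-backed inference replicas can be slow to start.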
Notes
283 stars on GitHub. Last updated 2023-11-03. Licensed Apache-2.0.
Pros
- Open source and community-driven, under the permissive Apache-2.0 license
- Written in Go for efficient resource usage and fast startup
- Supports multiple popular LLM serving frameworks
Cons
- Relatively small community (283 stars) may limit support and contributions
- Requires Kubernetes knowledge to deploy and configure
- May not handle complex, multi-model inference scenarios out of the box
Indexed from awesome-llmops and enriched with publicly available facts about the project.