OpenModelZ

by Community

Autoscale LLM inference (vLLM, SGLang, LMDeploy) on Kubernetes and other platforms

Added 1 June 2026

#cluster-manager #hacktoberfest #inference #llm #llmops #mlops

Overview

OpenModelZ is an open-source tool that autoscales LLM inference deployments (vLLM, SGLang, LMDeploy) on Kubernetes and other platforms. It monitors inference workloads and adjusts resources to maintain performance while minimizing over-provisioning.
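
To make "adjusts resources based on workload" concrete, here is a minimal sketch of a proportional scaling rule of the kind documented for the Kubernetes Horizontal Pod Autoscaler. It is an illustration only and is not taken from OpenModelZ's code. The function name `desiredReplicas`, the in-flight-requests metric, and the replica bounds are all assumptions made for this example.

```go
package main

import (
	"fmt"
	"math"
)

// desiredReplicas applies a proportional rule:
//   desired = ceil(current * observed / target), clamped to [minReplicas, maxReplicas].
// Hypothetical helper for illustration; not OpenModelZ's actual API.
func desiredReplicas(current int, observedPerReplica, targetPerReplica float64, minReplicas, maxReplicas int) int {
	if targetPerReplica <= 0 {
		return current // invalid target: leave the replica count unchanged
	}
	desired := int(math.Ceil(float64(current) * observedPerReplica / targetPerReplica))
	if desired < minReplicas {
		desired = minReplicas
	}
	if desired > maxReplicas {
		desired = maxReplicas
	}
	return desired
}

func main() {
	// 2 replicas, each seeing 30 in-flight requests against a target of 10: scale up.
	fmt.Println(desiredReplicas(2, 30, 10, 1, 8)) // 6
	// 4 replicas, each seeing 2 in-flight requests: scale down to the floor.
	fmt.Println(desiredReplicas(4, 2, 10, 1, 8)) // 1
	// A demand spike is capped at the configured maximum.
	fmt.Println(desiredReplicas(4, 50, 10, 1, 8)) // 8
}
```

Clamping to a minimum and maximum keeps a noisy metric from scaling the deployment to zero or past the available GPU budget. A real autoscaler would typically also add a stabilization window to avoid flapping.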

Best for

Teams operating LLM inference services on Kubernetes who need workload-driven autoscaling.

Use cases

  • Automatically scale vLLM deployments based on request load
  • Optimize GPU utilization for SGLang inference servers
  • Manage dynamic inference capacity for LMDeploy on Kubernetes

Notes

283 stars on GitHub. Last updated 2023-11-03. Licensed Apache-2.0.

Pros

  • Open source and community-driven, under the permissive Apache-2.0 license
  • Written in Go for efficient resource usage and fast startup
  • Supports multiple popular LLM serving frameworks

Cons

  • Relatively small community (283 stars) may limit support and contributions
  • Requires Kubernetes knowledge to deploy and configure
  • May not handle complex, multi-model inference scenarios out of the box

Indexed from the awesome-llmops list and enriched with the project's publicly available facts.
