KubeAI

by Community

AI Inference Operator for Kubernetes. The easiest way to serve ML models in production. Supports VLMs, LLMs, embeddings, and speech-to-text.

Added 1 June 2026

#ai #autoscaler #faster-whisper #inference-operator #k8s #kubernetes #llm #ollama

Overview

KubeAI is an open-source Kubernetes operator that deploys and serves ML models including VLMs, LLMs, embeddings, and speech-to-text. It automates model serving on Kubernetes clusters using a custom resource definition and handles scaling, resource allocation, and inference requests.
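As a rough illustration of the custom-resource approach described above, the sketch below builds a manifest for a single model and prints it as JSON, which `kubectl apply -f` accepts. The API group `kubeai.org/v1`, the `Model` kind, and the `spec` field names are assumptions recalled from the project's public documentation, not confirmed by this listing; the helper `build_model_manifest` and the example model name, URL, and resource profile are hypothetical. Check the current KubeAI docs before relying on any of them.

```python
import json


def build_model_manifest(name, url, engine, features, resource_profile,
                         min_replicas=0, max_replicas=3):
    """Return a dict shaped like a KubeAI Model custom resource.

    Field names are assumptions based on the project's public docs;
    verify them against the installed CRD before applying.
    """
    if min_replicas < 0 or max_replicas < min_replicas:
        raise ValueError("need 0 <= min_replicas <= max_replicas")
    return {
        "apiVersion": "kubeai.org/v1",  # assumed API group and version
        "kind": "Model",                # assumed custom resource kind
        "metadata": {"name": name},
        "spec": {
            "features": list(features),           # e.g. ["TextGeneration"]
            "url": url,                           # where the weights come from
            "engine": engine,                     # serving backend
            "resourceProfile": resource_profile,  # CPU/GPU sizing preset
            "minReplicas": min_replicas,          # lower autoscaling bound
            "maxReplicas": max_replicas,          # upper autoscaling bound
        },
    }


if __name__ == "__main__":
    # Hypothetical example values, chosen only for illustration.
    manifest = build_model_manifest(
        name="gemma2-2b-cpu",
        url="ollama://gemma2:2b",
        engine="OLlama",
        features=["TextGeneration"],
        resource_profile="cpu:2",
    )
    print(json.dumps(manifest, indent=2))
```

Saving the printed JSON to a file and applying it with `kubectl apply -f` would, under these assumptions, ask the operator to serve that model and scale it between the stated replica bounds.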

Best for

Teams already running Kubernetes who want a straightforward way to serve multiple model types in production.

Use cases

Deploy and serve large language models on existing Kubernetes infrastructure
Run embedding models for vector search pipelines in production
Serve speech-to-text models alongside other AI workloads in a unified cluster

Notes

1,201 stars on GitHub. Last updated 2026-06-01. Licensed Apache-2.0.

Pros

Simplifies ML model deployment with native Kubernetes integration
Supports a wide range of model types from a single operator
Active open-source community with over 1,200 GitHub stars

Cons

Requires existing Kubernetes expertise and cluster management
Limited to models that fit the operator’s supported formats
Community-driven project may have slower feature updates than commercial alternatives

Indexed from the awesome-llmops list and checked against the project's public information.

Pairs with

Other entries in the index that connect to this one.

Alternative to (1 entry)

Kubeflow

Community

Machine Learning Toolkit for Kubernetes

★ 15,700 stars · updated 1 month ago
