Kubeflow
by Community
Machine Learning Toolkit for Kubernetes
OSS
Added 1 June 2026
Overview
Kubeflow is an open-source ML toolkit that runs on Kubernetes, providing components for building and deploying machine learning workflows. It abstracts Kubernetes complexity to let teams define, train, and serve models as containerized pipelines without managing infrastructure directly.
Best for
Teams with Kubernetes infrastructure who need to standardize ML workflows across on-prem or multi-cloud environments
Use cases
- Orchestrating multi-step training pipelines across distributed clusters
- Managing model serving and inference at scale on Kubernetes
- Automating hyperparameter tuning and experiment tracking workflows
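The first use case, multi-step pipelines, comes down to running steps in dependency order and handing each step the outputs of the steps before it. The sketch below illustrates that idea in plain Python using only the standard library. It is not the Kubeflow Pipelines SDK: the step names, `DEPENDENCIES`, `run_step`, and `run_pipeline` are all hypothetical, and in a real deployment each step would run as a container on the cluster rather than as a local function call.

```python
from graphlib import TopologicalSorter

# Hypothetical step names; each maps to the set of steps it depends on.
DEPENDENCIES = {
    "preprocess": set(),
    "train": {"preprocess"},
    "evaluate": {"train"},
    "deploy": {"evaluate"},
}


def run_step(name, inputs):
    """Stand-in for a containerized step: records which upstream steps fed it."""
    return f"{name}({', '.join(sorted(inputs))})"


def run_pipeline(dependencies):
    """Run every step after its dependencies and return (order, outputs)."""
    order = list(TopologicalSorter(dependencies).static_order())
    outputs = {}
    for step in order:
        # Collect the outputs of the steps this one depends on.
        upstream = {dep: outputs[dep] for dep in dependencies[step]}
        outputs[step] = run_step(step, upstream)
    return order, outputs


order, outputs = run_pipeline(DEPENDENCIES)
print(order)
```

Declaring dependencies as data rather than hard-coding the call sequence is what lets an orchestrator schedule independent steps in parallel and retry a failed step without rerunning the whole chain.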
Notes
15,700 stars on GitHub. Last updated 2026-05-24. Licensed Apache-2.0.
Pros
- Runs on any Kubernetes cluster, avoiding vendor lock-in
- Handles distributed training and serving natively
- Active community with broad ecosystem integration
Cons
- Requires existing Kubernetes expertise to operate effectively
- Steep learning curve for teams new to container orchestration
- Observability tooling is basic compared to managed ML platforms
Indexed from awesome-llmops and enriched against its public facts.
Pairs with
Other entries in the index that connect to this one.
Docker
Community
The Moby Project - a collaborative project for the container ecosystem to assemble container-based systems
TensorFlow
Community
An Open Source Machine Learning Framework for Everyone
PyTorch
Community
Tensors and Dynamic neural networks in Python with strong GPU acceleration
Argo Workflows
Community
Workflow Engine for Kubernetes
Awesome Argo
Community
A curated list of awesome projects and resources related to Argo (a CNCF graduated project)
Awesome Federated Learning Systems
Community
Federated Learning Systems Paper List
Awesome Production Machine Learning
Community
A curated list of awesome open source libraries to deploy, monitor, version and scale your machine learning
Harmonia
Community
Federated Learning Made Easy
JuiceFS
Community
JuiceFS is a distributed POSIX file system built on top of Redis and S3.
Kaito
Community
Kubernetes AI Toolchain Operator
Katib
Community
Automated Machine Learning on Kubernetes
Kedro-Viz
Community
Visualise your Kedro data and machine-learning pipelines and track your experiments.
Kserve
Community
Standardized Distributed Generative and Predictive AI Inference Platform for Scalable, Multi-Framework Deployment on Kubernetes
KubeAI
Community
AI Inference Operator for Kubernetes. The easiest way to serve ML models in production. Supports VLMs, LLMs, embeddings, and speech-to-text.
Kueue
Community
Kubernetes-native Job Queueing
Maxim AI
Community
At Maxim AI, we are building the production infrastructure for AI. Maxim’s stack comprising gateway and governance, observability, and evals empowers AI teams to ship agents with
NNI
Community
An open source AutoML toolkit to automate the machine learning lifecycle, including feature engineering, neural architecture search, model compression and hyper-parameter tuning.
Primehub
Community
open-source MLOps platform
Puzzlet AI
Community
Seldon-core
Community
An MLOps framework to package, deploy, monitor and manage thousands of production machine learning models
TFServing
Community
A flexible, high-performance serving system for machine learning models
visenger/awesome-mlops
Community
A curated list of references for MLOps
Volcano
Community
A Cloud Native Batch System (Project under CNCF)
Weco Observe
Community
Build and Optimize your machine learning pipeline with the Weco Platform - based on AIDE ML, the LLM-powered code optimization Agent for Machine Learning Engineering.
Yunikorn
Community
Apache YuniKorn Core
ZenML
Community
ZenML 🙏: One AI Platform from Pipelines to Agents. https://zenml.io.
Airflow
Community
Platform created by the community to programmatically author, schedule and monitor workflows.
ClearML
Community
ClearML - Auto-Magical CI/CD to streamline your AI workload. Experiment Management, Data Management, Pipeline, Orchestration, Scheduling & Serving in one MLOps/LLMOps solution
Determined
Community
Determined is an open-source machine learning platform that simplifies distributed training, hyperparameter tuning, experiment tracking, and resource management. Works with PyTorch
dstack
Community
Open framework for confidential AI
Flyte
Community
Dynamic, resilient AI orchestration. Coordinate data, models, and compute as you build AI workflows.
Metaflow
Community
Build, Manage and Deploy AI/ML Systems
MLRun
Community
MLRun is an open source MLOps platform for quickly building and managing continuous ML applications across their lifecycle. MLRun integrates into your development and CI/CD environ
PAI
Community
Resource scheduling and cluster management for AI
Polyaxon
Community
Open Source AI Infra & Engineering Control Plane
Prefect
Community
Prefect is a workflow orchestration framework for building resilient data pipelines in Python.
Starwhale
Community
an MLOps/LLMOps platform
VDP
Community
🔮 Instill Core is a full-stack AI infrastructure tool for data, model and pipeline orchestration, designed to streamline every aspect of building versatile AI-first applications