Mosec
by Community
A high-performance ML model serving framework that offers dynamic batching and CPU/GPU pipelines to fully exploit your compute machine
OSS
Added 1 June 2026
Overview
Mosec is a high-performance ML model serving framework that supports dynamic batching and CPU/GPU pipelines. It exposes a Python interface on top of a web and task-coordination layer written in Rust, and is designed to maximize hardware utilization for inference workloads.
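To make "dynamic batching" concrete, here is a minimal standard-library sketch of the general idea: wait for the first request, then keep collecting until the batch is full or a short deadline passes. This is an illustration only, not Mosec's implementation or API; the function `collect_batch` and its parameters `max_batch_size` and `max_wait_time` are hypothetical names chosen for this example.

```python
import queue
import time


def collect_batch(requests, max_batch_size, max_wait_time):
    """Gather up to max_batch_size items, waiting at most max_wait_time seconds.

    Hypothetical helper for illustration. It blocks until the first item
    arrives, then keeps collecting until the batch is full or the deadline
    passes, whichever comes first.
    """
    batch = [requests.get()]  # block for the first request
    deadline = time.monotonic() + max_wait_time
    while len(batch) < max_batch_size:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break  # deadline reached: ship a partial batch
        try:
            batch.append(requests.get(timeout=remaining))
        except queue.Empty:
            break  # no more requests arrived in time
    return batch


# Five queued requests with a batch limit of four.
pending = queue.Queue()
for i in range(5):
    pending.put(i)

first = collect_batch(pending, max_batch_size=4, max_wait_time=0.05)
second = collect_batch(pending, max_batch_size=4, max_wait_time=0.05)
```

Under load the batch fills immediately and the model sees full batches; under light traffic the deadline bounds how long a single request waits before it is processed alone.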
Best for
ML engineers deploying custom Python models who need efficient batching and mixed CPU/GPU pipelines
Use cases
- Deploying custom machine learning models in production with automatic batching
- Running inference pipelines that require both CPU and GPU stages
- Serving models with variable request sizes to optimize throughput
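The CPU/GPU pipeline use case can be pictured as independent stages connected by queues, so that preprocessing of one request overlaps with inference on another. The sketch below is a hedged illustration using only the Python standard library; it does not use or reproduce Mosec's API. The names `run_stage`, `preprocess`, and `infer` are hypothetical, threads stand in for real workers, and `infer` is a placeholder rather than an actual GPU call.

```python
import queue
import threading

STOP = object()  # sentinel that tells a stage to shut down


def run_stage(func, inbox, outbox):
    """Apply func to every item from inbox and forward the result to outbox."""
    while True:
        item = inbox.get()
        if item is STOP:
            outbox.put(STOP)  # propagate shutdown to the next stage
            return
        outbox.put(func(item))


def preprocess(text):
    # Stand-in for CPU-bound work such as decoding or tokenizing.
    return text.strip().lower().split()


def infer(tokens):
    # Stand-in for a GPU-bound model call; here it only counts tokens.
    return len(tokens)


raw, prepared, results = queue.Queue(), queue.Queue(), queue.Queue()
stages = [
    threading.Thread(target=run_stage, args=(preprocess, raw, prepared)),
    threading.Thread(target=run_stage, args=(infer, prepared, results)),
]
for stage in stages:
    stage.start()

for request in ["  Hello world ", "one two three"]:
    raw.put(request)
raw.put(STOP)

for stage in stages:
    stage.join()

# Drain the final queue until the shutdown sentinel appears.
outputs = []
while True:
    item = results.get()
    if item is STOP:
        break
    outputs.append(item)
```

Because each stage has its own queue, the number of workers per stage can in principle be tuned separately, for example more CPU preprocessors feeding a single GPU worker.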
Notes
899 stars on GitHub. Last updated 2026-06-01. Licensed Apache-2.0.
Pros
- Dynamic batching adapts to request load for better resource use
- Supports hybrid CPU/GPU pipelines without extra orchestration
- Native Python integration simplifies workflow for Python-based ML teams
Cons
- Smaller community and ecosystem compared to established serving frameworks
- Limited support for non-Python model formats and clients
- Production hardening and monitoring features are less mature than alternatives like Triton or Ray Serve
Indexed from awesome-llmops and enriched with publicly available facts about the project.