O Open Source Observability medium

Mosec

by Community

A high-performance ML model serving framework that offers dynamic batching and CPU/GPU pipelines to make full use of your compute hardware.

Added 1 June 2026

#cv #deep-learning #gpu #hacktoberfest #jax #llm #llm-serving #machine-learning

Overview

Mosec is a high-performance ML model serving framework that supports dynamic batching and CPU/GPU pipelines. Models are served through a Python interface, while the web layer and task coordination are built in Rust; the design aims to maximize hardware utilization for inference workloads.
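To make the idea of dynamic batching concrete, here is a minimal standard-library sketch. It is not Mosec's implementation or API: the function `collect_batch` and its parameters `max_batch_size` and `max_wait_time` are hypothetical names chosen for illustration. The sketch groups queued requests into a batch until either the batch is full or a wait budget runs out.

```python
import queue
import time


def collect_batch(requests, max_batch_size, max_wait_time):
    """Hypothetical helper: gather up to max_batch_size items, waiting at
    most max_wait_time seconds after the first item arrives."""
    batch = [requests.get()]  # block until at least one request exists
    deadline = time.monotonic() + max_wait_time
    while len(batch) < max_batch_size:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break  # wait budget spent: ship a partial batch
        try:
            batch.append(requests.get(timeout=remaining))
        except queue.Empty:
            break  # no more requests arrived in time
    return batch


pending = queue.Queue()
for request_id in range(5):
    pending.put(request_id)

first = collect_batch(pending, max_batch_size=4, max_wait_time=0.2)
second = collect_batch(pending, max_batch_size=4, max_wait_time=0.2)
print(first, second)
```

Under load the batch fills quickly and is dispatched at full size; when traffic is light, the wait budget caps the extra latency a lone request pays.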

Best for

ML engineers deploying custom Python models who need efficient batching and mixed CPU/GPU pipelines

Use cases

Deploying custom machine learning models in production with automatic batching
Running inference pipelines that require both CPU and GPU stages
Serving models with variable request sizes to optimize throughput
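The CPU and GPU stages mentioned in the use cases above can be pictured as a chain of workers connected by queues. The following is a conceptual sketch using only the Python standard library, not Mosec's API: `run_stage`, `preprocess`, and `infer` are hypothetical names, threads stand in for real workers, and the "GPU" stage is a plain function.

```python
import queue
import threading

STOP = object()  # sentinel that tells a stage to shut down


def run_stage(fn, inbox, outbox):
    """Apply fn to every item from inbox and forward the result to outbox."""
    while True:
        item = inbox.get()
        if item is STOP:
            outbox.put(STOP)  # propagate shutdown to the next stage
            return
        outbox.put(fn(item))


def preprocess(text):
    # Stand-in for CPU-bound work such as decoding or tokenizing.
    return text.strip().lower()


def infer(text):
    # Stand-in for a GPU-bound model call.
    return len(text)


raw, cleaned, results = queue.Queue(), queue.Queue(), queue.Queue()
stages = [
    threading.Thread(target=run_stage, args=(preprocess, raw, cleaned)),
    threading.Thread(target=run_stage, args=(infer, cleaned, results)),
]
for stage in stages:
    stage.start()

for request in ["  Hello ", "MOSEC", " batching  "]:
    raw.put(request)
raw.put(STOP)

for stage in stages:
    stage.join()

outputs = []
while True:
    item = results.get()
    if item is STOP:
        break
    outputs.append(item)
print(outputs)
```

Because each stage has its own inbox, the preprocessing stage can keep working on new requests while the inference stage is busy, which is the main reason to split a pipeline this way.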

Notes

899 stars on GitHub. Last updated 2026-06-01. Licensed Apache-2.0.

Pros

Dynamic batching adapts to request load for better resource use
Supports hybrid CPU/GPU pipelines without extra orchestration
Native Python integration simplifies workflow for Python‑based ML teams

Cons

Smaller community and ecosystem compared to established serving frameworks
Limited support for non-Python model formats and clients
Production hardening and monitoring features are less mature than alternatives like Triton or Ray Serve

Indexed from awesome-llmops and enriched against its public facts.

Pairs with

Other entries in the index that connect to this one.

Built with (1 entry)

O OSS Obs medium

PyTorch

Community

Tensors and Dynamic neural networks in Python with strong GPU acceleration

★ 100,318 updated 1mo ago

Pairs with (1 entry)

O OSS Framework medium

vLLM

Community

A high-throughput and memory-efficient inference and serving engine for LLMs

★ 81,619 updated 1mo ago
