O Open Source Observability medium

Modelz-LLM

by Community

OpenAI compatible API for LLMs and embeddings (LLaMA, Vicuna, ChatGLM and many others)

Visit Community View repo Submit your build →

OSS

Modelz-LLM

Added 1 June 2026

#llm #nlp #openai-api #transformer

Overview

Modelz-LLM provides an OpenAI-compatible API for serving open-source LLMs and embedding models like LLaMA, Vicuna, and ChatGLM. It enables local deployment and seamless integration with existing OpenAI tooling and clients.

Best for

Best for
Developers who want to experiment with open-source LLMs using familiar OpenAI API patterns

Use cases

Run open-source LLMs using OpenAI API patterns without code changes
Generate embeddings for retrieval-augmented generation or similarity search
Swap between multiple models by modifying configuration rather than client code

Notes

277 stars on GitHub. Last updated 2023-10-11. Licensed Apache-2.0.

Use cases

Run open-source LLMs using OpenAI API patterns without code changes
Generate embeddings for retrieval-augmented generation or similarity search
Swap between multiple models by modifying configuration rather than client code

Pros

Drop-in replacement for OpenAI API calls, reducing migration effort
Supports a broad range of open-source models in a single service
Simple Python-based deployment with minimal dependencies

Cons

Community project with modest GitHub stars (277), implying limited support and updates
Categorized as observability, but its core function is model serving rather than monitoring
Documentation and community resources are sparse, increasing troubleshooting time

Indexed from awesome-llmops and enriched against its public facts.

Pros

Drop-in replacement for OpenAI API calls, reducing migration effort
Supports a broad range of open-source models in a single service
Simple Python-based deployment with minimal dependencies

Cons

Community project with modest GitHub stars (277), implying limited support and updates
Categorized as observability, but its core function is model serving rather than monitoring
Documentation and community resources are sparse, increasing troubleshooting time

Pairs with

Other entries in the index that connect to this one. Click through to see the chain.

Uses1entry

O OSS Framework medium

vLLM

Community

A high-throughput and memory-efficient inference and serving engine for LLMs

★ 81,619 updated 1mo ago

Pairs with1entry

P Apps Productivity low

Open WebUI

Various

User-friendly AI Interface (Supports Ollama, OpenAI API, ...)

★ 139,558 updated 1mo ago

Free 27-page guide

Get the free Developer’s Field Guide

A 27-page field guide to the AI coding workflow with Claude. Claude Code, MCP servers, the prompt patterns that work, and what to delegate. Free.

Enter your work email. We send it straight over, plus a few short notes worth knowing. Unsubscribe any time.

← Back to Open Source Submit your own entry →