Enterprise DNA
O Open Source Observability medium

Modelz-LLM

by Community

OpenAI compatible API for LLMs and embeddings (LLaMA, Vicuna, ChatGLM and many others)

M

OSS

Modelz-LLM

Added 1 June 2026

#llm #nlp #openai-api #transformer

Overview

Modelz-LLM provides an OpenAI-compatible API for serving open-source LLMs and embedding models like LLaMA, Vicuna, and ChatGLM. It enables local deployment and seamless integration with existing OpenAI tooling and clients.

Best for

Best for
Developers who want to experiment with open-source LLMs using familiar OpenAI API patterns

Use cases

  • Run open-source LLMs using OpenAI API patterns without code changes
  • Generate embeddings for retrieval-augmented generation or similarity search
  • Swap between multiple models by modifying configuration rather than client code

Notes

Modelz-LLM provides an OpenAI-compatible API for serving open-source LLMs and embedding models like LLaMA, Vicuna, and ChatGLM. It enables local deployment and seamless integration with existing OpenAI tooling and clients.

277 stars on GitHub. Last updated 2023-10-11. Licensed Apache-2.0.

Use cases

  • Run open-source LLMs using OpenAI API patterns without code changes
  • Generate embeddings for retrieval-augmented generation or similarity search
  • Swap between multiple models by modifying configuration rather than client code

Pros

  • Drop-in replacement for OpenAI API calls, reducing migration effort
  • Supports a broad range of open-source models in a single service
  • Simple Python-based deployment with minimal dependencies

Cons

  • Community project with modest GitHub stars (277), implying limited support and updates
  • Categorized as observability, but its core function is model serving rather than monitoring
  • Documentation and community resources are sparse, increasing troubleshooting time

Indexed from awesome-llmops and enriched against its public facts.

Pros

  • Drop-in replacement for OpenAI API calls, reducing migration effort
  • Supports a broad range of open-source models in a single service
  • Simple Python-based deployment with minimal dependencies

Cons

  • Community project with modest GitHub stars (277), implying limited support and updates
  • Categorized as observability, but its core function is model serving rather than monitoring
  • Documentation and community resources are sparse, increasing troubleshooting time