Jwrede/llmprobe
by Various
Synthetic monitoring and CI smoke tests for LLM inference endpoints.
MCP
Added 1 June 2026
Overview
Jwrede/llmprobe is a Go tool for synthetic monitoring and CI smoke tests of LLM inference endpoints. It sends predefined prompts to an endpoint and checks the responses against expected patterns or latency thresholds.
Best for
Developers who need a minimal, scriptable smoke test for LLM endpoints in CI or monitoring pipelines.
Use cases
- Verify an LLM endpoint is responding correctly after a deployment
- Run periodic health checks on production inference services
- Integrate into CI pipelines to catch regressions before release
Notes
1 star on GitHub. Last updated 2026-05-16. Licensed MIT.
Pros
- Lightweight Go binary with no external dependencies
- Simple configuration for quick setup
- Designed specifically for LLM endpoints, not generic HTTP monitoring
Cons
- Very early stage with only 1 GitHub star and limited community adoption
- No built-in support for complex test scenarios or multi-step workflows
- Lacks alerting or dashboard features out of the box
Indexed from awesome-mcp-servers-punkpeye and enriched against its public facts.
Pairs with
Other entries in the index that connect to this one.
vLLM
Community
A high-throughput and memory-efficient inference and serving engine for LLMs
ollama
Community
Get up and running with Kimi-K2.5, GLM-5, MiniMax, DeepSeek, gpt-oss, Qwen, Gemma and other models.
llama.cpp
Community
LLM inference in C/C++
LiteLLM 🚅
Community
Python SDK, Proxy Server (AI Gateway) to call 100+ LLM APIs in OpenAI (or native) format, with cost tracking, guardrails, loadbalancing and logging. [Bedrock, Azure, OpenAI, Vertex