iris-eval/mcp-server

by Various

The agent eval standard for MCP — score output quality, catch safety failures, enforce cost budgets

Added 1 June 2026

#agent-evaluation #ai-agent #claude #eval #evaluation #llm #mcp #mcp-server

Overview

iris-eval/mcp-server provides a standardized evaluation framework for agents using the Model Context Protocol (MCP). It scores output quality, detects safety failures, and enforces cost budgets to help developers assess and control agent behavior.

Best for
Developers building and evaluating agents that use the Model Context Protocol

Use cases

  • Benchmark agent output quality against defined criteria
  • Automatically catch safety violations during agent execution
  • Enforce per-call or cumulative cost limits to prevent budget overruns
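To make the cost-limit use case concrete, here is a minimal sketch of how per-call and cumulative limits could be checked together. It does not use the iris-eval/mcp-server API: `CostBudget`, `BudgetLimits`, `BudgetVerdict`, and `record` are hypothetical names invented for this illustration, and costs are tracked in integer cents as an assumption to avoid floating-point drift.

```typescript
// Hypothetical sketch: none of these names come from iris-eval/mcp-server.

interface BudgetLimits {
  perCallCents: number;    // maximum cost allowed for a single call
  cumulativeCents: number; // maximum total cost across all accepted calls
}

type BudgetVerdict =
  | { ok: true; totalCents: number }
  | { ok: false; reason: "per-call" | "cumulative"; totalCents: number };

class CostBudget {
  private totalCents = 0;
  private readonly limits: BudgetLimits;

  constructor(limits: BudgetLimits) {
    this.limits = limits;
  }

  // Check one call's cost against both limits.
  // A rejected call is not added to the running total.
  record(costCents: number): BudgetVerdict {
    if (costCents > this.limits.perCallCents) {
      return { ok: false, reason: "per-call", totalCents: this.totalCents };
    }
    if (this.totalCents + costCents > this.limits.cumulativeCents) {
      return { ok: false, reason: "cumulative", totalCents: this.totalCents };
    }
    this.totalCents += costCents;
    return { ok: true, totalCents: this.totalCents };
  }
}
```

In this sketch a caller would invoke `record` before or after each agent call and stop the run when the verdict is not `ok`. The `reason` field distinguishes a single oversized call from a run that has exhausted its overall budget.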

Notes

6 stars on GitHub. Last updated 2026-05-25. Licensed MIT.

Pros

  • Offers a formal evaluation standard for MCP-based agents
  • Combines quality, safety, and cost checks in one tool
  • Written in TypeScript for type-safe integration

Cons

  • Very low GitHub star count (6) suggests limited community adoption
  • Tightly coupled to the MCP ecosystem and of little use outside it
  • Requires agent infrastructure already built on MCP

Indexed from awesome-mcp-servers-punkpeye and enriched with publicly available facts about the project.