Evidently

by Community

Evidently is an open-source ML and LLM observability framework. Evaluate, test, and monitor any AI-powered system or data pipeline. From tabular data to Gen AI. 100+ metrics.

Added 1 June 2026

#data-drift #data-quality #data-science #data-validation #generative-ai #hacktoberfest #html-report #jupyter-notebook

Overview

Evidently is an open-source framework for ML and LLM observability. It evaluates, tests, and monitors AI systems and data pipelines. It supports tabular data and generative AI with over 100 metrics.

Best for

Data scientists and ML engineers who need a comprehensive, open-source observability framework for both traditional models and LLMs.

Use cases

  • Evaluating LLM outputs against ground truth
  • Monitoring data drift in production ML pipelines
  • Testing model performance with pre-defined test suites
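
To make the drift-monitoring use case concrete, here is a minimal, standalone sketch of the kind of check a drift monitor performs: a Population Stability Index (PSI) between a reference sample and a current sample of one numeric feature. This is not Evidently's API. The function name, the bin count, the smoothing constant, and the 0.2 alert threshold are all assumptions chosen for illustration.

```python
import math
from typing import List, Sequence


def population_stability_index(
    reference: Sequence[float],
    current: Sequence[float],
    bins: int = 10,          # assumption: 10 equal-width bins
    eps: float = 1e-6,       # assumption: floor to avoid log(0) on empty bins
) -> float:
    """Illustrative PSI between two samples of one numeric feature."""
    if not reference or not current:
        raise ValueError("both samples must be non-empty")

    lo, hi = min(reference), max(reference)
    if lo == hi:
        raise ValueError("reference sample has no spread to bin")
    width = (hi - lo) / bins

    def bucket_shares(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for v in values:
            # Bin edges come from the reference sample; values outside
            # that range are clamped into the first or last bin.
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        return [max(c / len(values), eps) for c in counts]

    ref_shares = bucket_shares(reference)
    cur_shares = bucket_shares(current)
    return sum(
        (c - r) * math.log(c / r) for r, c in zip(ref_shares, cur_shares)
    )


if __name__ == "__main__":
    reference = [i / 100 for i in range(100)]        # roughly uniform on [0, 1)
    shifted = [0.5 + i / 200 for i in range(100)]    # mass moved to upper half
    psi = population_stability_index(reference, shifted)
    # Assumption: 0.2 is a commonly quoted rule-of-thumb alert level.
    print("drift suspected" if psi > 0.2 else "no drift flagged")
```

A framework such as Evidently packages many checks of this sort behind presets and reports, so in practice you would rely on its documented metrics rather than hand-rolling them.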

Notes

7,561 stars on GitHub. Last updated 2026-05-02. Licensed Apache-2.0.

Pros

  • Open-source with large community support
  • Covers both ML and LLM observability with 100+ metrics
  • Integrates with Jupyter notebooks for exploratory analysis

Cons

  • Primarily designed for notebook environments; less turnkey for production deployment
  • Steep learning curve for setting up custom monitoring pipelines
  • Limited to the Python ecosystem

Indexed from awesome-llm and enriched against its public facts.
