whylogs

by Community

An open-source data logging library for machine learning models and data pipelines. 📚 Provides visibility into data quality & model performance over time. 🛡️ Supports privacy-pre

Added 1 June 2026

#ai-pipelines #analytics #approximate-statistics #calculate-statistics #constraints #data-constraints #data-pipeline #data-quality

Overview

An open-source library for logging data profiles from machine learning models and pipelines. It tracks data quality metrics and model performance over time while supporting privacy-preserving data collection.
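
To make the idea of a "data profile" concrete, here is a minimal sketch in plain Python. It is not the whylogs API: the `ColumnProfile` class and `profile_records` function are hypothetical names invented for illustration. It assumes numeric columns and shows how a profile can keep only running summaries (counts, nulls, min, max, mean) while never storing the raw values, which is the general idea behind privacy-aware logging.

```python
import math
from dataclasses import dataclass


@dataclass
class ColumnProfile:
    """Running summary of one numeric column; raw values are never stored."""
    count: int = 0
    nulls: int = 0
    total: float = 0.0
    minimum: float = math.inf
    maximum: float = -math.inf

    def track(self, value):
        # Missing values are counted but do not affect the numeric summary.
        if value is None:
            self.nulls += 1
            return
        self.count += 1
        self.total += value
        self.minimum = min(self.minimum, value)
        self.maximum = max(self.maximum, value)

    @property
    def mean(self):
        # NaN when no non-null value has been seen yet.
        return self.total / self.count if self.count else float("nan")


def profile_records(records):
    """Build one ColumnProfile per column from an iterable of dict rows."""
    profiles = {}
    for row in records:
        for column, value in row.items():
            profiles.setdefault(column, ColumnProfile()).track(value)
    return profiles


# Hypothetical sample rows, used only to exercise the sketch.
rows = [
    {"age": 31, "income": 52000.0},
    {"age": 45, "income": None},
    {"age": 28, "income": 61000.0},
]
profiles = profile_records(rows)
```

After this runs, `profiles["income"]` holds a count of 2, one null, and a mean of 56500.0, and nothing in it can be traced back to an individual row.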

Best for

Teams needing lightweight, privacy-aware data quality logging for ML pipelines

Use cases

  • Monitor data drift in production ML pipelines
  • Audit data quality before training or inference
  • Log model predictions with statistical summaries
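
The drift and audit use cases above can be sketched as a comparison between two batch summaries. This is an illustrative example in plain Python, not whylogs code: `summarize`, `check_batch`, and both thresholds are assumptions chosen for the demo, and the drift test (mean shift measured in reference standard deviations) is a deliberately simple stand-in for whatever drift metric a real pipeline would use. It also assumes each batch has at least one non-null value.

```python
import statistics


def summarize(values):
    """Reduce a batch to a small summary: size, null rate, mean, std dev."""
    present = [v for v in values if v is not None]
    return {
        "rows": len(values),
        "null_rate": 1 - len(present) / len(values),
        "mean": statistics.fmean(present),
        "stdev": statistics.pstdev(present),
    }


def check_batch(reference, current, max_null_rate=0.1, max_shift=2.0):
    """Return a list of issues found in the current batch summary."""
    issues = []
    # Data quality audit: too many missing values.
    if current["null_rate"] > max_null_rate:
        issues.append("null rate above threshold")
    # Drift check: mean shift in units of the reference standard deviation.
    if reference["stdev"] > 0:
        shift = abs(current["mean"] - reference["mean"]) / reference["stdev"]
        if shift > max_shift:
            issues.append("mean drifted from reference")
    return issues


# Hypothetical batches: a training-time reference and two production batches.
reference = summarize([10.0, 12.0, 11.0, 9.0, 13.0])
healthy = summarize([11.0, 10.0, 12.0, 12.0, 10.0])
drifted = summarize([25.0, None, 27.0, 26.0, None])

healthy_issues = check_batch(reference, healthy)
drifted_issues = check_batch(reference, drifted)
```

Because the check works on summaries rather than rows, the same comparison can run before training, before inference, or on a schedule against production batches. Acting on the returned issues (paging, blocking a job) is left to whatever tooling sits around the profiler.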

Notes

2,819 stars on GitHub. Last updated 2025-01-10. Licensed Apache-2.0.

Pros

  • Open-source and community-backed
  • Privacy-preserving data collection capabilities
  • Tracks data quality and model performance over time

Cons

  • Not a standalone monitoring solution; production deployment requires additional tooling
  • Limited to statistical profiling, with no built-in alerting
  • Relatively small community compared to larger observability platforms

Indexed from awesome-llmops and enriched against its public facts.
