Delta-Lake
by Community
An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs
OSS
Added 1 June 2026
Overview
An open-source storage framework that provides ACID transactions and schema enforcement on data lakes. It supports compute engines such as Spark, PrestoDB, Flink, Trino, and Hive, enabling a Lakehouse architecture.
Best for
Data engineers building scalable, reliable Lakehouse architectures on existing data lakes
Use cases
- Building a reliable Lakehouse with ACID transactions on data lakes
- Running batch and streaming pipelines with unified metadata management
- Enforcing schemas and data quality constraints, with controlled schema evolution, across multiple engines
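To make the schema enforcement use case concrete, here is a minimal sketch in plain Python of what rejecting a mismatched write means. It is a toy illustration only and not Delta Lake's API: `TABLE_SCHEMA` and `validate_batch` are hypothetical names invented for this example.

```python
# Toy illustration only: this is not Delta Lake's API.
TABLE_SCHEMA = {"id": int, "event": str}  # hypothetical table schema


def validate_batch(rows, schema=TABLE_SCHEMA):
    """Raise ValueError if any row's columns or types differ from the schema."""
    for i, row in enumerate(rows):
        # Reject rows with missing or unexpected columns.
        if set(row) != set(schema):
            raise ValueError(
                f"row {i}: columns {sorted(row)} do not match {sorted(schema)}"
            )
        # Reject rows whose values have the wrong type.
        for column, expected in schema.items():
            if not isinstance(row[column], expected):
                raise ValueError(
                    f"row {i}: column {column!r} is not {expected.__name__}"
                )
    return rows
```

A batch is accepted only if every row matches the schema, so a bad write fails as a whole rather than leaving partly written data behind.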
Notes
8,829 stars on GitHub. Last updated 2026-06-01. Licensed Apache-2.0.
Pros
- Open-source with strong community backing and 8,829 GitHub stars
- Integrates with a wide range of compute engines and APIs
- Provides time travel and versioning for data recovery and auditing
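Time travel and versioning can be pictured, conceptually, as an append-only log of commits in which reading an older version means replaying the log up to that point. The sketch below is a toy model in plain Python under that assumption. It is not Delta Lake's implementation or API, and `ToyTable`, `commit`, and `snapshot` are hypothetical names.

```python
class ToyTable:
    """Toy model of a versioned table: an append-only list of commits."""

    def __init__(self):
        # Commit N holds the list of actions that produced version N.
        self._commits = []

    def commit(self, add=(), remove=()):
        """Append one commit and return its version number."""
        actions = [("add", f) for f in add] + [("remove", f) for f in remove]
        self._commits.append(actions)
        return len(self._commits) - 1

    def snapshot(self, version=None):
        """Replay commits 0..version to get the set of live files."""
        if version is None:
            version = len(self._commits) - 1  # default to the latest version
        if not 0 <= version < len(self._commits):
            raise ValueError(f"unknown version {version}")
        files = set()
        for actions in self._commits[: version + 1]:
            for kind, name in actions:
                if kind == "add":
                    files.add(name)
                else:
                    files.discard(name)
        return files


table = ToyTable()
v0 = table.commit(add=["part-0.parquet"])
v1 = table.commit(add=["part-1.parquet"], remove=["part-0.parquet"])
```

Here `table.snapshot(v0)` still sees only `part-0.parquet` after the second commit has replaced it, which is the property that makes recovery and auditing possible.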
Cons
- Originally designed for Spark; tight integration with other engines can require extra configuration
- Scala codebase may be less accessible to teams primarily using Python or SQL
- Setup and tuning in non-Spark environments can add operational complexity
Indexed from awesome-llmops and enriched against its public facts.