
Apache Iceberg

The open table format for huge analytic datasets.


Overview

Apache Iceberg is an open table format for huge analytic datasets. It manages large collections of files as tables and supports modern analytical data lake operations such as record-level inserts, updates, and deletes, as well as time travel. Iceberg is designed to work across query engines and cloud storage systems rather than being tied to a specific one.
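To make "managing files as tables" and "time travel" more concrete, here is a minimal conceptual sketch. It is not the Iceberg API or its on-disk format: `ToyTable`, `Snapshot`, `commit`, and `files_as_of` are hypothetical names invented for illustration. It only assumes the general idea that each commit records a new immutable snapshot of which data files make up the table.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Snapshot:
    snapshot_id: int
    files: frozenset  # data files visible in this version of the table


@dataclass
class ToyTable:
    """Hypothetical in-memory model; NOT the Iceberg API or file layout."""
    snapshots: list = field(default_factory=list)

    def current_files(self):
        # The latest snapshot defines the current table contents.
        return self.snapshots[-1].files if self.snapshots else frozenset()

    def commit(self, added=(), removed=()):
        # Each commit produces a new snapshot; earlier ones stay readable.
        files = (self.current_files() - frozenset(removed)) | frozenset(added)
        snap = Snapshot(len(self.snapshots) + 1, files)
        self.snapshots.append(snap)
        return snap.snapshot_id

    def files_as_of(self, snapshot_id):
        # "Time travel": read the file set recorded by an earlier snapshot.
        for snap in self.snapshots:
            if snap.snapshot_id == snapshot_id:
                return snap.files
        raise KeyError(snapshot_id)


table = ToyTable()
first = table.commit(added=["data-001.parquet"])
# A rewrite (e.g. after an update) swaps one file for another.
second = table.commit(added=["data-002.parquet"], removed=["data-001.parquet"])
```

Reading `table.files_as_of(first)` still returns the original file, while `table.current_files()` reflects the rewrite. Keeping old snapshots addressable is what makes version rollback possible in this toy model.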

✨ Key Features

  • Schema evolution
  • Hidden partitioning
  • Time travel and version rollback
  • ACID transactions
  • Engine-agnostic (works with Spark, Trino, Flink, etc.)
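Hidden partitioning can be sketched in the same spirit. The example below is a toy model under stated assumptions, not Iceberg code: `ToyPartitionedTable`, `day_transform`, and the `event_ts` column are hypothetical. The idea it illustrates is that the table derives partition values from a regular column, so readers filter on that column and never reference a partition column directly.

```python
from collections import defaultdict
from datetime import datetime


def day_transform(ts):
    # Derive the partition value from the timestamp column (assumed transform).
    return ts.date()


class ToyPartitionedTable:
    """Hypothetical illustration of hidden partitioning; not the Iceberg API."""

    def __init__(self, transform):
        self.transform = transform
        self.partitions = defaultdict(list)

    def append(self, row):
        # Writers supply only the source column; the partition is computed.
        self.partitions[self.transform(row["event_ts"])].append(row)

    def scan(self, start, end):
        # Readers filter on event_ts; partitions outside the range are skipped.
        lo, hi = self.transform(start), self.transform(end)
        touched = [key for key in self.partitions if lo <= key <= hi]
        rows = [
            row
            for key in touched
            for row in self.partitions[key]
            if start <= row["event_ts"] <= end
        ]
        return rows, len(touched)


events = ToyPartitionedTable(day_transform)
events.append({"id": 1, "event_ts": datetime(2024, 1, 1, 9, 0)})
events.append({"id": 2, "event_ts": datetime(2024, 1, 2, 9, 0)})
events.append({"id": 3, "event_ts": datetime(2024, 1, 3, 9, 0)})

rows, partitions_read = events.scan(
    datetime(2024, 1, 2, 0, 0), datetime(2024, 1, 2, 23, 59)
)
```

The query names only `event_ts`, yet the scan reads a single day partition. Pruning is safe here because the day transform preserves ordering, so a range on the timestamp maps to a range of partition values.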

🎯 Key Differentiators

  • Engine-agnostic design
  • Strong focus on correctness and reliability
  • Hidden partitioning for improved performance and ease of use

Unique Value: Provides a reliable and open foundation for your data lake, with features like ACID transactions, schema evolution, and time travel, while remaining independent of any specific query engine.

🎯 Use Cases

  • Building reliable and scalable data lakes
  • Data warehousing on the data lake
  • Streaming data ingestion
  • Data governance and compliance
  • Incremental data processing

✅ Best For

  • Creating a reliable and performant data lake with ACID transactions and schema evolution.

💡 Check With Vendor

Verify these considerations match your specific requirements:

  • Iceberg is not a replacement for a transactional (OLTP) database.

🏆 Alternatives

  • Delta Lake
  • Apache Hudi

Iceberg offers a more open and engine-agnostic approach than Delta Lake, which is closely associated with Spark. Compared with using plain Parquet or ORC files, it adds table-level features such as ACID transactions, schema evolution, and time travel.

💻 Platforms

Any platform that supports a compatible query engine and file system.

✅ Offline Mode Available

🔌 Integrations

  • Apache Spark
  • Apache Flink
  • Trino
  • Dremio
  • Snowflake

💰 Pricing

Open source and free to use.
