datazip-inc/olake-fusion

OLake-Fusion

OLake-Fusion is a lakehouse table management system for Apache Iceberg.
It helps teams run faster queries, lower storage cost, and operate Iceberg at scale with less effort.


Why OLake-Fusion

Apache Iceberg is powerful in production, but its day-2 operations can be expensive and complex. OLake-Fusion adds an operational layer on top of Iceberg so your team can focus on data products instead of maintenance jobs.

With OLake-Fusion, you can:

  • Keep query performance stable with continuous self-optimization.
  • Reduce storage and compute waste from small-file and metadata overhead.
  • Manage tables consistently across different catalogs and environments.
  • Build lake-native data platforms that are decoupled from the underlying infrastructure and serve both streaming and batch workloads.

Architecture

OLake-Fusion architecture

  • Fusion (Management Service): Handles table lifecycle operations such as self-optimization and data expiration, and provides a unified catalog interface across engines.
  • Spark Optimizer: Runs optimization tasks that improve file layout and maintain read efficiency.
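To make "optimization tasks that improve file layout" more concrete, here is a minimal sketch of the planning step behind small-file compaction: grouping undersized data files into rewrite groups that each fit a target output size. This is not OLake-Fusion's implementation. The `DataFile` type, the `plan_compaction` function, the 128 MiB target, and the 0.75 small-file ratio are all assumptions chosen for illustration, and the grouping uses a plain first-fit-decreasing heuristic.

```python
from dataclasses import dataclass
from typing import List

MB = 1024 * 1024


@dataclass(frozen=True)
class DataFile:
    """Hypothetical stand-in for a data file entry in a table's metadata."""
    path: str
    size_bytes: int


def plan_compaction(
    files: List[DataFile],
    target_bytes: int = 128 * MB,   # assumed target size for rewritten files
    small_ratio: float = 0.75,      # assumed cutoff: files below 75% of target are "small"
) -> List[List[DataFile]]:
    """Group small files into rewrite groups of at most target_bytes each.

    Uses first-fit decreasing: largest small files are placed first, each into
    the first group that still has room.
    """
    threshold = int(target_bytes * small_ratio)
    small = sorted(
        (f for f in files if f.size_bytes < threshold),
        key=lambda f: f.size_bytes,
        reverse=True,
    )

    groups: List[List[DataFile]] = []
    totals: List[int] = []
    for f in small:
        for i, total in enumerate(totals):
            if total + f.size_bytes <= target_bytes:
                groups[i].append(f)
                totals[i] += f.size_bytes
                break
        else:
            # No existing group has room, so start a new one.
            groups.append([f])
            totals.append(f.size_bytes)

    # Rewriting a single file on its own does not reduce the file count.
    return [g for g in groups if len(g) > 1]


if __name__ == "__main__":
    files = [
        DataFile("a.parquet", 10 * MB),
        DataFile("b.parquet", 20 * MB),
        DataFile("c.parquet", 30 * MB),
        DataFile("d.parquet", 100 * MB),
        DataFile("e.parquet", 200 * MB),
    ]
    for group in plan_compaction(files):
        print([f.path for f in group])
```

In this sketch, files that are already close to or above the target size are left untouched, so each rewrite group only spends compute where the file count actually drops.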

Key Features

  • Self-Optimizing Tables: Automatically compacts files and organizes data to keep read latency low.
  • Multi-Catalog Support: Works with catalogs such as Glue, JDBC, and REST-based catalogs.
  • Infrastructure Independent: Deploy on private cloud, public cloud, hybrid cloud, or multi-cloud.
  • Lakehouse Ready: Designed for modern analytics workloads on open table formats.

Benchmark Highlights

  • Compaction runs up to 2x faster than vanilla Spark compaction in benchmark scenarios.
  • Around 5% better query performance in tested workloads.

Read the full benchmark details: Compaction Benchmark
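For context on the baseline, "vanilla Spark compaction" on Iceberg is commonly run through Iceberg's `rewrite_data_files` Spark procedure. The sketch below only builds that SQL statement. It assumes this procedure is the kind of baseline meant here, which the text above does not confirm. The helper name `rewrite_data_files_sql` and the catalog and table names are hypothetical.

```python
from typing import Dict, Optional


def rewrite_data_files_sql(
    catalog: str,
    table: str,
    options: Optional[Dict[str, str]] = None,
) -> str:
    """Build a CALL statement for Iceberg's rewrite_data_files Spark procedure.

    `catalog` and `table` are placeholders supplied by the caller; `options`
    maps Iceberg rewrite option names (for example 'target-file-size-bytes')
    to string values.
    """
    args = [f"table => '{table}'"]
    if options:
        # Spark's map() takes alternating key and value arguments.
        pairs = ", ".join(f"'{k}', '{v}'" for k, v in options.items())
        args.append(f"options => map({pairs})")
    joined = ", ".join(args)
    return f"CALL {catalog}.system.rewrite_data_files({joined})"


if __name__ == "__main__":
    stmt = rewrite_data_files_sql(
        "my_catalog",   # hypothetical catalog name
        "db.events",    # hypothetical table name
        {"target-file-size-bytes": str(512 * 1024 * 1024)},
    )
    print(stmt)
```

With a Spark session that has the Iceberg runtime and SQL extensions configured, the resulting string could be passed to `spark.sql(...)`.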

Quick Start

Start with the first end-to-end setup guide:

Helpful next reads:

Community

Contributing

Contributions of all sizes are welcome.

About

Simple streaming table maintenance by OLake - Compaction, Orphan File Cleanup & Snapshot Expiry

License

Apache-2.0 (see LICENSE and LICENSE-binary)
