8 changes: 6 additions & 2 deletions examples/README.md
@@ -21,11 +21,15 @@ This directory contains self-contained, end-to-end demo stacks for OLake.
# 1) Start base Olake stack
curl -sSL https://raw.githubusercontent.com/datazip-inc/olake-ui/master/docker-compose.yml | docker compose -f - up -d

# 2) Start an example
# 2) Clone the repository and navigate to the root directory
git clone https://github.com/datazip-inc/olake.git
cd olake

# 3) Start an example
cd examples/presto-tabularest-minio-mysql
docker compose up -d

# 3) Follow suggested steps in README.md for the example
# 4) Follow suggested steps in README.md for the example
```
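
If you want to sanity-check that both stacks came up before opening the UI, listing the containers is usually enough. A minimal sketch (exact container names vary by example):

```bash
# Both the base OLake stack and the example stack should show
# their services as running/healthy before you continue.
docker compose ps
docker ps --filter "name=olake" --format "table {{.Names}}\t{{.Status}}"
```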

Each example’s `README.md` includes:
32 changes: 13 additions & 19 deletions examples/presto-tabularest-minio-mysql/README.md
@@ -23,13 +23,7 @@ This example demonstrates a complete data pipeline using:

## Quick Start

### 1. Start the Base Olake Stack

```bash
curl -sSL https://raw.githubusercontent.com/datazip-inc/olake-ui/master/docker-compose.yml | docker compose -f - up -d
```

### 2. Start the Demo Stack
### 1. Start the Demo Stack

```bash
# Navigate to this example directory
@@ -39,7 +33,7 @@ cd examples/presto-tabularest-minio-mysql
docker compose up -d
```

### 3. Accessing Services
### 2. Accessing Services
1. **Log in** to the Olake UI at [http://localhost:8000](http://localhost:8000) with credentials `admin`/`password`.

2. **Verify Source Data:**
@@ -56,7 +50,7 @@ docker compose up -d

3. **Create and Configure a Job:**
Create a Job to define and run the data pipeline:
* On the main page, click on the **"Create your first Job"** button.
* On the main page, click on the **"Create your first Job"** button. Set the job name and replication frequency.

* **Set up the Source:**
* **Connector:** `MySQL`
@@ -67,6 +61,8 @@ docker compose up -d
* **Database:** `weather`
* **Username:** `root`
* **Password:** `password`
* **SSH Config:** `No Tunnel`
* **Update Method:** `Standalone`

* **Set up the Destination:**
* **Connector:** `Apache Iceberg`
@@ -75,34 +71,31 @@ docker compose up -d
* **Version:** choose the latest available version
* **Iceberg REST Catalog URI:** `http://host.docker.internal:8181`
* **Iceberg S3 Path:** `s3://warehouse/weather/`
* **Iceberg Database:** `weather`
* **Database:** `weather`
* **S3 Endpoint (for Iceberg data files written by Olake workers):** `http://host.docker.internal:9090`
* **AWS Region:** `us-east-1`
* **S3 Access Key:** `minio`
* **S3 Secret Key:** `minio123`

* **Select Streams to sync:**
* Select the weather table using checkbox to sync from Source to Destination.
* Click on the weather table and set Normalisation to `true` using the toggle button.

* **Configure Job:**
* Set job name and replication frequency.
* Make sure that the weather table has been selected for the sync.
* Click on the weather table and make sure that the Normalisation is set to `true` using the toggle button.

* **Save and Run the Job:**
* Save the job configuration.
* Run the job manually from the UI to initiate the data pipeline from MySQL to Iceberg by clicking **Sync now**.
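
Once the sync completes, you can optionally confirm that the job registered its namespace and table in the REST catalog before querying. This is a sketch that assumes the catalog's port `8181` is also published on the host:

```bash
# List namespaces known to the Iceberg REST catalog
curl -s http://localhost:8181/v1/namespaces

# List tables inside the namespace created by the job
# (replace {job_name} with your actual job name)
curl -s "http://localhost:8181/v1/namespaces/{job_name}_weather/tables"
```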

### 4. Query Data with Presto
### 3. Query Data with Presto

1. **Access Presto UI:** [http://localhost:8088](http://localhost:8088)

2. **Run Queries:**
- Click on **SQL CLIENT** at the top
- Select **Catalog:** `iceberg`, **Schema:** `weather`
- Select **Catalog:** `iceberg`, **Schema:** `{job_name}_weather`
- Query example:
```sql
SELECT station_state, AVG(temperature_avg) as avg_temp
FROM iceberg.weather.weather
FROM iceberg.{job_name}_weather.weather
GROUP BY station_state
ORDER BY avg_temp DESC
LIMIT 10;
```

@@ -132,7 +125,8 @@ SELECT * FROM weather LIMIT 5;
### Test Presto Connection
```bash
# Check if Presto can see Iceberg tables
docker exec -it olake-presto-coordinator presto-cli --catalog iceberg --schema weather --execute "SHOW TABLES;"
# Make sure to replace {job_name} with your actual job name.
docker exec -it olake-presto-coordinator presto-cli --catalog iceberg --schema {job_name}_weather --execute "SHOW TABLES;"
```
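
You can also check that Iceberg data files actually landed in MinIO. A sketch using the AWS CLI, assuming the MinIO S3 API is reachable on `localhost:9090` with the demo credentials from the destination config:

```bash
# List the Iceberg data/metadata files written under the warehouse path
AWS_ACCESS_KEY_ID=minio AWS_SECRET_ACCESS_KEY=minio123 \
  aws --endpoint-url http://localhost:9090 s3 ls --recursive s3://warehouse/weather/
```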

### Common Issues
37 changes: 15 additions & 22 deletions examples/trino-tablurarest-minio-mysql/README.md
@@ -20,13 +20,7 @@ This example demonstrates an end-to-end data lakehouse pipeline:

## Quick Start

### 1. Start the Base OLake Stack

```bash
curl -sSL https://raw.githubusercontent.com/datazip-inc/olake-ui/master/docker-compose.yml | docker compose -f - up -d
```

### 2. Start the Demo Stack
### 1. Start the Demo Stack

```bash
# Navigate to this example directory
@@ -36,7 +30,7 @@ cd examples/trino-tablurarest-minio-mysql
docker compose up -d
```

### 3. Accessing Services
### 2. Accessing Services

1. **Log in** to the OLake UI at [http://localhost:8000](http://localhost:8000) with credentials `admin`/`password`.

@@ -54,7 +48,7 @@ docker compose up -d

3. **Create and Configure a Job:**
Create a Job to define and run the data pipeline:
* On the main page, click on the **"Create your first Job"** button.
* On the main page, click on the **"Create your first Job"** button. Please make sure to **set the job name** as `job` and select a replication frequency.

* **Set up the Source:**
* **Connector:** `MySQL`
@@ -65,45 +59,44 @@ docker compose up -d
* **Database:** `weather`
* **Username:** `root`
* **Password:** `password`
* **SSH Config:** `No Tunnel`
* **Update Method:** `Standalone`

* **Set up the Destination:**
* **Connector:** `Apache Iceberg`
* **Catalog:** `REST catalog`
* **Name of your destination:** `olake_iceberg`
* **Version:** chose the latest available version
* **Version:** choose the latest available version
* **Iceberg REST Catalog URI:** `http://host.docker.internal:8181`
* **Iceberg S3 Path:** `s3://warehouse/weather/`
* **Iceberg Database:** `weather`
* **Database:** `weather`
* **S3 Endpoint (for Iceberg data files written by OLake workers):** `http://host.docker.internal:9090`
* **AWS Region:** `us-east-1`
* **S3 Access Key:** `minio`
* **S3 Secret Key:** `minio123`

* **Select Streams to sync:**
* Select the weather table using checkbox to sync from Source to Destination.
* Click on the weather table and set Normalisation to `true` using the toggle button.

* **Configure Job:**
* Set job name and replication frequency.
* Make sure that the weather table has been selected for the sync.
* Click on the weather table and make sure that the Normalisation is set to `true` using the toggle button.

* **Save and Run the Job:**
* Save the job configuration.
* Run the job manually from the UI to initiate the data pipeline from MySQL to Iceberg by clicking **Sync now**.
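
Before moving to SQLPad, you can optionally confirm from Trino that the job's schema was created. This sketch uses the same coordinator container as the CLI examples further down:

```bash
# The job_weather schema should appear once the sync has completed
docker exec -it olake-trino-coordinator trino \
  --catalog iceberg \
  --execute "SHOW SCHEMAS;"
```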

### 4. Query Data with Trino
### 3. Query Data with Trino

1. **Access SQLPad UI:** [http://localhost:3000](http://localhost:3000)
1. **Access SQLPad UI:** [http://localhost:3000](http://localhost:3000) using credentials `admin`/`password`.

2. **Run Queries via SQLPad UI:**
* On the top left, select **OLake Demo** as the database
* Click on the **refresh button** to reload the database schemas
* Click on **weather** schema and the **weather** table under it will be listed
* Click on the **job_weather** schema and the **weather** table under it will be listed
* Enter the SQL query below in the text box and click **Run** to execute it

3. **Query example:**
```sql
SELECT station_state, AVG(temperature_avg) as avg_temp
FROM iceberg.weather.weather
FROM job_weather.weather
GROUP BY station_state
ORDER BY avg_temp DESC
LIMIT 10;
```

A review comment on the `FROM job_weather.weather` line:

> **Bug: Trino SQLPad Configuration Schema Naming Issue**
>
> The Trino example's SQLPad configuration hardcodes the schema to `job_weather`. This requires users to name their OLake job exactly "job" for the example to function, creating an inconsistency with other examples that support dynamic job naming.
@@ -112,7 +105,7 @@
4. **(Optional) Run Queries via Trino CLI:**
```bash
docker exec -it olake-trino-coordinator trino \
--catalog iceberg --schema weather \
--catalog iceberg --schema job_weather \
--execute "SELECT * from weather LIMIT 10;"
```

@@ -146,7 +139,7 @@ SELECT * FROM weather LIMIT 5;
```bash
# Check if Trino can see Iceberg tables
docker exec -it olake-trino-coordinator trino \
--catalog iceberg --schema weather \
--catalog iceberg --schema job_weather \
--execute "SHOW TABLES;"
```
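
To verify the files behind those tables, you can point the AWS CLI at MinIO. A sketch assuming the S3 API is published on `localhost:9090` with the demo credentials:

```bash
# Iceberg data and metadata files written by the sync
AWS_ACCESS_KEY_ID=minio AWS_SECRET_ACCESS_KEY=minio123 \
  aws --endpoint-url http://localhost:9090 s3 ls --recursive s3://warehouse/weather/
```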

2 changes: 1 addition & 1 deletion examples/trino-tablurarest-minio-mysql/docker-compose.yml
@@ -178,7 +178,7 @@ services:
SQLPAD_CONNECTIONS__olake__port: 8088
SQLPAD_CONNECTIONS__olake__username: admin
SQLPAD_CONNECTIONS__olake__catalog: iceberg
SQLPAD_CONNECTIONS__olake__schema: weather
SQLPAD_CONNECTIONS__olake__schema: job_weather
networks:
- olake-network
depends_on:
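
Since containers only receive environment variables at creation, a SQLPad container that is already running needs to be recreated to pick up the new schema value. A sketch, assuming the service is named `sqlpad` in this compose file (check `docker compose ps` for the actual name):

```bash
# Recreate just the SQLPad service so it re-reads its environment,
# including the updated SQLPAD_CONNECTIONS__olake__schema value
docker compose up -d --force-recreate sqlpad
```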