In this PoC we stream CDC events from MySQL into Iceberg tables stored in MinIO (S3-compatible storage).
You can place multiple valid queries in each of these files:
- source_queries.txt
- sink_queries.txt
- queries.txt
All queries must be valid and end with `;`, as the parser uses this character to split the file into individual queries.
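The splitting rule above can be sketched as follows. This is a minimal illustration in Python, not the actual parser inside the jar (which is a Java/Flink app and may additionally handle comments or `;` inside string literals); the sample SQL is hypothetical:

```python
def split_queries(text: str) -> list[str]:
    """Split a query file's contents on ';' and drop empty fragments.

    Sketch of the splitting rule described above; the real parser
    in the jar may behave differently around comments or quoting.
    """
    return [q.strip() + ";" for q in text.split(";") if q.strip()]

# Hypothetical file contents with two statements:
sample = """
CREATE TABLE src (id INT);
INSERT INTO sink SELECT * FROM src;
"""
for query in split_queries(sample):
    print(query)
```

Each recovered query keeps its trailing `;`, and blank fragments (for example after the final semicolon) are discarded.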
- Start the docker containers: run `docker-compose build && docker-compose up -d`
- Build the jar: run `mvn clean install`. This builds the jar at `target/cdc-iceberg-poc-1.0-SNAPSHOT.jar`.
- Open the Flink UI in a browser: http://localhost:8081
- Go to `Submit New Job` in the left menu.
- Upload the jar from the `target` folder.
- Click on the uploaded jar and paste the command-line options for the app into the Program Arguments box:
`-s /data/source_queries.txt -o /data/sink_queries.txt -p /data/queries.txt`
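The three flags above appear to map one-to-one onto the three query files. A rough Python sketch of that mapping (the real app is a Java/Flink job, so the destination names here are hypothetical; only the flags and paths come from the step above):

```python
import argparse

def parse_app_args(argv: list[str]) -> argparse.Namespace:
    """Illustrative parser for the app's command-line options.

    Assumed meaning (not confirmed by the source): -s = source
    queries file, -o = sink queries file, -p = general queries file.
    """
    parser = argparse.ArgumentParser()
    parser.add_argument("-s", dest="source_queries", required=True)
    parser.add_argument("-o", dest="sink_queries", required=True)
    parser.add_argument("-p", dest="queries", required=True)
    return parser.parse_args(argv)

args = parse_app_args(
    ["-s", "/data/source_queries.txt",
     "-o", "/data/sink_queries.txt",
     "-p", "/data/queries.txt"]
)
print(args.source_queries, args.sink_queries, args.queries)
```

Note that the paths start with `/data/`, i.e. they refer to locations inside the Flink containers, not on the host.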
- Submit the job. Ignore the error pop-up if it says it could not wait for the job to finish, and check the Running Jobs section instead.
- Open MinIO in a browser: http://localhost:9001
- Take the username and password from the docker-compose.yaml file, then check the bucket for the data.
- Open the Spark notebook in a browser: http://localhost:8888
- Open notebook/Query_Iceberg_Tables.ipynb and run the notebook to see the Iceberg table data.