Connecting Python and StarRocks with ADBC

Instructions

This example uses StarRocks, an open query engine for sub-second, ad-hoc analytics both on and off the data lakehouse.

Tip

If you already have a StarRocks instance running, skip the steps to set up StarRocks.

Prerequisites

  1. Install uv

  2. Install dbc

Set up StarRocks

  1. Install Docker

  2. Start a StarRocks instance:

    docker run --rm -p 9030:9030 -p 8030:8030 -p 8040:8040 -p 9408:9408 -p 9419:9419 -itd \
    --name quickstart starrocks/allin1-ubuntu
  3. Configure StarRocks for Arrow Flight SQL using one of the following options:

    Option A: Quick setup (automated)

    Run these commands from your terminal:

    docker exec quickstart sed -i 's/JAVA_OPTS="-Dlog4j2/JAVA_OPTS="--add-opens=java.base\/java.nio=org.apache.arrow.memory.core,ALL-UNNAMED -Dlog4j2/' /data/deploy/starrocks/fe/conf/fe.conf
    docker exec quickstart bash -c 'echo "arrow_flight_port = 9408" >> /data/deploy/starrocks/fe/conf/fe.conf'
    docker exec quickstart bash -c 'echo "arrow_flight_port = 9419" >> /data/deploy/starrocks/be/conf/be.conf'
    docker restart quickstart

    Option B: Manual setup

    If you prefer to understand and apply the changes yourself:

    1. Open a shell inside the container:

      docker exec -it quickstart bash
    2. Edit the FE (frontend) configuration:

      vi /data/deploy/starrocks/fe/conf/fe.conf
      • Find the JAVA_OPTS line and add the following --add-opens flag at the beginning of its value, keeping the existing options (shown here as ...):

        JAVA_OPTS="--add-opens=java.base/java.nio=org.apache.arrow.memory.core,ALL-UNNAMED ..."
        
      • Add this line at the end of the file:

        arrow_flight_port = 9408
        
    3. Edit the BE (backend) configuration:

      vi /data/deploy/starrocks/be/conf/be.conf
      • Add this line at the end of the file:

        arrow_flight_port = 9419
        
    4. Exit the container and restart it:

      exit
      docker restart quickstart
  4. Verify that the container is ready by waiting for it to report a healthy status:

    docker ps --filter "name=quickstart"

    You should see (healthy) in the status before proceeding.
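Beyond the health status, it can be useful to confirm that the two Arrow Flight ports configured above (9408 for the FE, 9419 for the BE) accept connections from the host. The following is a minimal sketch using only the Python standard library. The function names `port_is_open` and `check_arrow_flight_ports` are hypothetical helpers, not part of this example's repository, and `localhost` assumes the Docker setup described above.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False


def check_arrow_flight_ports(
    host: str = "localhost",
    ports: tuple[int, ...] = (9408, 9419),  # FE and BE ports from the setup above
) -> dict[int, bool]:
    """Map each port to whether it currently accepts TCP connections."""
    return {port: port_is_open(host, port) for port in ports}


if __name__ == "__main__":
    for port, is_open in check_arrow_flight_ports().items():
        print(f"port {port}: {'open' if is_open else 'closed'}")
```

An open port only shows that something accepts TCP connections there. With published Docker ports this may be Docker's own port forwarder, so treat the result as a quick sanity check rather than proof that Arrow Flight SQL is serving requests.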

Connect to StarRocks

  1. Install the Flight SQL ADBC driver:

    dbc install flightsql
  2. Customize the Python script main.py as needed.

    • Change the connection arguments in db_kwargs
      • uri is the URI of your StarRocks instance. The host and FE Arrow Flight port depend on your installation; with the Docker setup above, the FE Arrow Flight port is 9408.
      • username and password are the username and password of your StarRocks user.
    • Change the SQL SELECT statement in cursor.execute() if desired.
  3. Run the Python script:

    uv run main.py
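For orientation, a script along these lines might look like the sketch below. It is not the repository's actual main.py. It assumes the `adbc-driver-manager` and `pyarrow` packages, a driver manager version recent enough to locate the dbc-installed `flightsql` driver by name, and the Docker setup above (FE Arrow Flight port 9408, an unencrypted `grpc://` connection). The `root` user with an empty password is also an assumption about the quickstart image, and `build_db_kwargs` and `run_query` are hypothetical helper names.

```python
# /// script
# dependencies = ["adbc-driver-manager", "pyarrow"]
# ///
# The inline metadata above is an assumption about how dependencies are
# declared for `uv run`; the real main.py may declare them differently.


def build_db_kwargs(
    host: str = "localhost",
    port: int = 9408,      # FE arrow_flight_port from the setup above
    username: str = "root",  # assumed default user of the quickstart image
    password: str = "",      # assumed empty default password
) -> dict[str, str]:
    """Build the connection arguments passed to the driver as db_kwargs."""
    return {
        "uri": f"grpc://{host}:{port}",
        "username": username,
        "password": password,
    }


def run_query(sql: str = "SELECT 1", **connect_args):
    """Run one SQL statement over Flight SQL and return a pyarrow Table."""
    # Imported lazily so build_db_kwargs can be used without the package.
    from adbc_driver_manager import dbapi

    with dbapi.connect(
        driver="flightsql",
        db_kwargs=build_db_kwargs(**connect_args),
    ) as conn:
        with conn.cursor() as cursor:
            cursor.execute(sql)
            return cursor.fetch_arrow_table()
```

With StarRocks running, calling `print(run_query())` at the end of the script would print the result table. Passing `host`, `port`, `username`, or `password` to `run_query` overrides the assumed defaults.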

Clean up

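Stop the Docker container running StarRocks. Because it was started with --rm, stopping it also removes the container and any data stored in it: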

docker stop quickstart