Connecting R and StarRocks with ADBC

Instructions

This example uses StarRocks, an open query engine for sub-second, ad-hoc analytics both on and off the data lakehouse.

Tip

If you already have a StarRocks instance running, skip the steps to set up StarRocks.

Prerequisites

  1. Install dbc

  2. Install R packages adbcdrivermanager, arrow, and tibble:

    install.packages(c("adbcdrivermanager", "arrow", "tibble"))
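To confirm the R packages are in place before continuing, a small shell check along these lines may help. It is only a sketch: the function name `check_r_packages` is hypothetical, and it assumes `Rscript` is on your `PATH`.

```shell
# Report which of the required R packages are missing.
# Assumption: Rscript is on PATH. The function name is illustrative only.
check_r_packages() {
  local missing=""
  local pkg
  for pkg in adbcdrivermanager arrow tibble; do
    # requireNamespace() returns FALSE when a package is not installed.
    if ! Rscript -e "if (!requireNamespace('$pkg', quietly = TRUE)) quit(status = 1)" >/dev/null 2>&1; then
      missing="$missing $pkg"
    fi
  done
  if [ -n "$missing" ]; then
    echo "missing R packages:$missing" >&2
    return 1
  fi
  echo "all required R packages are installed"
}
```

Call it as `check_r_packages`; a non-zero exit status means at least one package still needs to be installed.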

Set up StarRocks

  1. Install Docker

  2. Start a StarRocks instance:

    docker run --rm -p 9030:9030 -p 8030:8030 -p 8040:8040 -p 9408:9408 -p 9419:9419 -itd \
    --name quickstart starrocks/allin1-ubuntu
  3. Configure StarRocks for Arrow Flight SQL using one of the following options:

    Option A: Quick setup (automated)

    Run these commands from your terminal:

    docker exec quickstart sed -i 's/JAVA_OPTS="-Dlog4j2/JAVA_OPTS="--add-opens=java.base\/java.nio=org.apache.arrow.memory.core,ALL-UNNAMED -Dlog4j2/' /data/deploy/starrocks/fe/conf/fe.conf
    docker exec quickstart bash -c 'echo "arrow_flight_port = 9408" >> /data/deploy/starrocks/fe/conf/fe.conf'
    docker exec quickstart bash -c 'echo "arrow_flight_port = 9419" >> /data/deploy/starrocks/be/conf/be.conf'
    docker restart quickstart

    Option B: Manual setup

    If you prefer to understand and apply the changes yourself:

    1. Open a shell inside the container:

      docker exec -it quickstart bash
    2. Edit the FE (frontend) configuration:

      vi /data/deploy/starrocks/fe/conf/fe.conf
      • Find the JAVA_OPTS line and add the --add-opens option for the Arrow memory module at the beginning of its value:

        JAVA_OPTS="--add-opens=java.base/java.nio=org.apache.arrow.memory.core,ALL-UNNAMED ..."
        
      • Add this line at the end of the file:

        arrow_flight_port = 9408
        
    3. Edit the BE (backend) configuration:

      vi /data/deploy/starrocks/be/conf/be.conf
      • Add this line at the end of the file:
        arrow_flight_port = 9419
        
    4. Exit the container and restart it:

      exit
      docker restart quickstart
  4. Verify the container is ready by checking its health status:

    docker ps --filter "name=quickstart"

    Wait until the STATUS column shows (healthy) before proceeding. If it still shows (health: starting), rerun the command after a few seconds.
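The configuration and health checks above can also be scripted. The sketch below assumes the container name `quickstart` and the config paths used in the setup steps. The function names, retry count, and delay are hypothetical choices, not part of StarRocks or Docker.

```shell
# Print the Arrow Flight related settings found in the FE and BE config files.
# Assumption: container name and paths match the setup steps above.
show_arrow_flight_config() {
  local name="${1:-quickstart}"
  local conf
  for conf in /data/deploy/starrocks/fe/conf/fe.conf /data/deploy/starrocks/be/conf/be.conf; do
    echo "== $conf"
    docker exec "$name" grep -E 'arrow_flight_port|add-opens' "$conf" \
      || echo "(no Arrow Flight settings found)"
  done
}

# Poll the container's health status until it reports "healthy".
# Arguments: container name, number of attempts, seconds between attempts
# (the defaults are illustrative).
wait_for_healthy() {
  local name="${1:-quickstart}"
  local attempts="${2:-60}"
  local delay="${3:-5}"
  local status="unknown"
  local i=0
  while [ "$i" -lt "$attempts" ]; do
    status="$(docker inspect --format '{{.State.Health.Status}}' "$name" 2>/dev/null)" || status="unknown"
    if [ "$status" = "healthy" ]; then
      echo "$name is healthy"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "$name did not become healthy (last status: $status)" >&2
  return 1
}
```

For example, `show_arrow_flight_config quickstart && wait_for_healthy quickstart` prints the settings and then blocks until the container is healthy or the attempts run out.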

Connect to StarRocks

  1. Install the Flight SQL ADBC driver:

    dbc install flightsql
  2. Customize the R script main.R as needed.

    • Change the connection arguments in adbc_database_init()
      • uri is the URI of your StarRocks instance. The host and FE Arrow Flight port depend on your installation; with the Docker setup above, the host is localhost and the FE Arrow Flight port is 9408.
      • username and password are the username and password of your StarRocks user.
    • Change the SQL SELECT statement in read_adbc() if desired.
  3. Run the R script:

    Rscript main.R
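If the script cannot connect, checking that the FE Arrow Flight port accepts TCP connections can help narrow the problem down. This is a minimal sketch: the function name `check_flight_port` is hypothetical, the defaults assume the Docker setup above (localhost, port 9408), and it relies on an `nc` build that supports the `-z` option.

```shell
# Check whether a TCP port is reachable.
# Assumptions: nc with -z support is installed; the defaults match the
# quickstart container's FE Arrow Flight port.
check_flight_port() {
  local host="${1:-localhost}"
  local port="${2:-9408}"
  if nc -z "$host" "$port" 2>/dev/null; then
    echo "port $port on $host is reachable"
  else
    echo "port $port on $host is not reachable" >&2
    return 1
  fi
}
```

Run `check_flight_port localhost 9408`. A failure suggests the container is not running, the port is not published, or arrow_flight_port was not applied to fe.conf.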

Clean up

Stop the Docker container running StarRocks. Because it was started with --rm, stopping it also removes the container:

docker stop quickstart