Added features for Nessie branch management #315

Open: lorepas wants to merge 1 commit into dremio:main from lorepas:main
@lorepas lorepas commented Jan 20, 2026

Summary

This PR adds branch management support for Nessie sources.

Description

The changes add two new model configurations:

  • branch: specifies the branch to work on. When used with a table materialization, the PDS is created on the specified branch (if the branch doesn't exist, it is created first). When used with a view (VDS), the view is created from the PDS on the specified branch.
  • nessie_ref: to be set on a table materialization. When present, the new branch is created from the given reference; otherwise, if the branch doesn't exist, it is created from main.
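
To make the interaction between the two options easier to follow, here is a minimal Python sketch of the decision logic as described above. It is not the code from this PR: `ModelConfig` and `plan_actions` are hypothetical names, the actions are plain strings rather than calls to Dremio or Nessie, and the sketch assumes that an already existing branch is reused as-is even when `nessie_ref` is set.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass(frozen=True)
class ModelConfig:
    """Hypothetical stand-in for the dbt model config values discussed above."""
    materialized: str                  # "table" (PDS) or "view" (VDS)
    branch: Optional[str] = None       # target branch, if any
    nessie_ref: Optional[str] = None   # tag/branch/commit to create the branch from


def plan_actions(cfg: ModelConfig, existing_branches: Set[str]) -> List[str]:
    """Return the steps described in the PR text as strings (illustration only)."""
    if cfg.branch is None:
        # Neither option is in play: the usual behavior applies.
        return ["build with the usual behavior"]

    actions: List[str] = []
    if cfg.materialized == "table":
        if cfg.branch not in existing_branches:
            # Assumption: nessie_ref only matters when the branch must be created.
            base = cfg.nessie_ref or "main"
            actions.append(f"create branch {cfg.branch} from {base}")
        actions.append(f"create PDS on branch {cfg.branch}")
    elif cfg.materialized == "view":
        actions.append(f"create VDS from PDS on branch {cfg.branch}")
    return actions
```

For example, `plan_actions(ModelConfig("table", "test_2", "init"), {"main"})` yields a branch creation from `init` followed by the PDS creation on `test_2`, while the same call without `nessie_ref` creates the branch from `main`.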

Test Results

  • Dremio OSS: 25.1.1
  • Nessie: 0.105.3
  • dbt-dremio: 1.10.0

I performed the following tests with this stack:

  • Created a Nessie-type source on Dremio
  • Created a tag named "init" from the Dremio interface
  • In the dbt project, I created two files: create_pds.sql, for the PDS to be created on Nessie, and my_first_query.sql, for the VDS to be created from that PDS in the Staging folder of a Space.

create_pds.sql:

{{ config(
object_storage_source="mynessiesource",
materialized="table",
branch="test_1"
) }}

SELECT * FROM Samples."samples.dremio.com"."NYC-taxi-trips.csv"

my_first_query.sql:

{{ config(
schema = "Staging",
materialized="view",
branch="test_1"
) }}

select *
from {{ ref('create_pds') }}

  • Running the project with dbt run creates the test_1 branch and then creates the PDS and the VDS, both correctly pointing to the specified branch.

  • If we rerun dbt run without changing anything, it checks whether a table with that name already exists on the branch; if it does, the table is dropped and then recreated.

  • In create_pds.sql, you can also add the nessie_ref config, specifying a valid reference (for example, the tag created at the beginning):

{{ config(
object_storage_source="mynessiesource",
materialized="table",
branch="test_2",
nessie_ref="init"
) }}

SELECT * FROM Samples."samples.dremio.com"."NYC-taxi-trips.csv"
  • This way, the test_2 branch is created starting from the init tag. This is useful when PDSs need to be created starting from a specific tag, branch, or commit.
  • In the Dremio interface, I confirmed that the test_2 branch was correctly created starting from the tag.
  • When neither branch nor nessie_ref is set, the behavior is unchanged.
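
The rerun behavior observed in the tests above (check for the table on the branch, drop it if present, then recreate it) can be sketched as follows. This is only an illustration under stated assumptions: `rebuild_table` is a hypothetical helper, and the catalog is an in-memory dictionary mapping branch names to table names, not a real Nessie or Dremio client.

```python
from typing import Dict, List, Set


def rebuild_table(catalog: Dict[str, Set[str]], branch: str, table: str) -> List[str]:
    """Drop `table` on `branch` if it already exists, then (re)create it.

    `catalog` is a hypothetical in-memory stand-in for the branch contents.
    Returns the steps taken, in order, as plain strings.
    """
    steps: List[str] = []
    tables = catalog.setdefault(branch, set())
    if table in tables:
        # A table with this name already exists on the branch: drop it first.
        tables.remove(table)
        steps.append(f"drop {table} on {branch}")
    tables.add(table)
    steps.append(f"create {table} on {branch}")
    return steps
```

Calling `rebuild_table` twice with the same arguments mirrors two consecutive `dbt run` invocations: the first call only creates the table, and the second drops it before recreating it.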

Changelog

  • Added a summary of what this PR accomplishes to CHANGELOG.md

Contributor License Agreement

Related Issue

@CLAassistant

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.


lorenzop seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.
