Airbyte connection field path ordering causes spurious changes #172

@alexambarch

I think it's worth creating an issue to continue the discussion about field path ordering from #169.

Summary

When creating an Airbyte connection, if a stream's field paths are not listed in order of length (shortest first), the Airbyte provider reports the stream as changed on later plans, even though the selected fields have not changed.

Steps to reproduce

  1. Use abctl local install to create a clean Airbyte installation.
  2. Spin up a MySQL container for the destination. Set the password and database to whatever you want. You can use a different destination or configuration; I just happened to pick this one.
docker run --name mysql -e MYSQL_ROOT_PASSWORD=airbyte -e MYSQL_DATABASE=xkcd -p 3306:3306 -d mysql:8
  3. Use the following minimal main.tf, filling in the local values with your own:
terraform {
  required_providers {
    airbyte = {
      source  = "airbytehq/airbyte"
      version = "0.6.5"
    }
  }
}

# Fill in your own values here. The server URL should be `https://your-url.local`, no path or trailing slash
locals {
  airbyte_client_id     = "your client ID"
  airbyte_client_secret = "your client secret"
  airbyte_server_url    = "your server url"

  mysql_host            = "your mysql container hostname"
  mysql_database        = "your database"
  mysql_username        = "root"
  mysql_password        = "your root password"
}

provider "airbyte" {
  server_url    = "${local.airbyte_server_url}/api/public/v1"
  client_id     = local.airbyte_client_id
  client_secret = local.airbyte_client_secret
}

resource "airbyte_workspace" "default" {
  name = "airbyte-workspace"
}

resource "airbyte_source_xkcd" "default" {
  name         = "xkcd"
  workspace_id = airbyte_workspace.default.workspace_id

  configuration = {
    comic_number = 1
  }
}

resource "airbyte_destination_mysql" "default" {
  name         = "mysql"
  workspace_id = airbyte_workspace.default.workspace_id

  configuration = {
    host     = local.mysql_host
    port     = 3306
    database = local.mysql_database
    username = local.mysql_username
    password = local.mysql_password

    ssl = false
  }
}

resource "airbyte_connection" "default" {
  name = "airbyte-connection"

  source_id      = airbyte_source_xkcd.default.source_id
  destination_id = airbyte_destination_mysql.default.destination_id

  schedule = {
    schedule_type = "manual"
  }

  configurations = {
    streams = [{
      name      = "xkcd"
      sync_mode = "full_refresh_overwrite"
      selected_fields = [
        {
          field_path = ["month"]
        },
        {
          field_path = ["link"]
        }
      ]
    }]
  }
}

output "airbyte_workspace_url" {
  value = "${local.airbyte_server_url}/workspaces/${airbyte_workspace.default.workspace_id}"
}
  4. Apply the configuration:
terraform init && terraform apply
  5. After the configuration is applied, running either terraform plan or terraform apply will report changes in the connection's stream, even though nothing has changed.
  6. Reverse the order of the field paths:
        {
          field_path = ["link"]
        },
        {
          field_path = ["month"]
        }
  7. Running a plan or apply will now report no changes.

Ideally, I'd like this behavior to be documented, or logic similar to #170 to be implemented so that the provider orders the fields automatically. That would substantially reduce the noise in my Terraform plans. Since working through this issue, I've manually edited all of my connections to use the required ordering, but it would have been nice to know about this behavior beforehand.
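
Until the provider handles this, one possible workaround is to have Terraform produce the ordering itself. The following is only a sketch under assumptions: the local names (`xkcd_fields`, `xkcd_fields_by_length`, `xkcd_selected_fields`) are hypothetical, it handles only single-segment field paths, and it assumes that "shortest name first" is the ordering the provider expects, which is inferred from the reproduction above rather than confirmed by the provider's documentation.

```hcl
locals {
  # Hypothetical input: the fields to select, written in any order.
  xkcd_fields = ["month", "link"]

  # sort() is lexicographic, so prefix each name with its zero-padded
  # length, sort on that key, then strip the 4-character prefix again.
  # Names of equal length fall back to alphabetical order.
  xkcd_fields_by_length = [
    for key in sort([for f in local.xkcd_fields : format("%04d%s", length(f), f)]) :
    substr(key, 4, -1)
  ]

  # Shape expected by selected_fields in the airbyte_connection stream
  # block shown in the reproduction (single-segment paths only).
  xkcd_selected_fields = [
    for f in local.xkcd_fields_by_length : { field_path = [f] }
  ]
}
```

With this in place, the stream in the connection resource could use `selected_fields = local.xkcd_selected_fields` instead of a hand-ordered list. For the input above, `xkcd_fields_by_length` evaluates to `["link", "month"]`, which matches the ordering that produced no changes in the last step.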
