Overriding DBR cluster installed packages with DBX #847

@Edwardp17

Expected Behavior

The scikit-learn version in production should be 1.3.0.

Current Behavior

In production, the scikit-learn version is 1.0.2, which corresponds to the DBR version of my workflow cluster (12.2).

Steps to Reproduce (for bugs)

Context

I'm using DBX to build my package with poetry and to run code from it as jobs on Databricks clusters, using DBX workflow definitions. I expected the code executed through DBX to run in the same environment I have locally, since I build my local environment with poetry and DBX also builds my wheel with poetry (same dependency specifications). I'm trying to use DBX to centralize my project's configuration so that my development and production environments use the same dependency versions.

Is there any way to force DBR clusters to use my poetry-specified dependencies with DBX (that is, to override the cluster-installed packages through DBX)?
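For anyone trying to confirm the mismatch from inside the job, something like the following could be run at the start of the task. It is only a minimal sketch: the helper name `find_version_mismatches` and the `EXPECTED` dict are hypothetical (not part of DBX), and the pins are assumed to be copied by hand from the poetry configuration. It uses only the standard library's `importlib.metadata`.

```python
from importlib import metadata
from typing import Callable, Dict, Optional, Tuple


def find_version_mismatches(
    expected: Dict[str, str],
    get_version: Callable[[str], str] = metadata.version,
) -> Dict[str, Tuple[str, Optional[str]]]:
    """Return {package: (expected, installed)} for each package whose
    installed version differs from the expected pin.

    `installed` is None when the package is not installed at all.
    """
    mismatches = {}
    for name, wanted in expected.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


if __name__ == "__main__":
    # Hypothetical pins, assumed to mirror the poetry dependency spec.
    EXPECTED = {"scikit-learn": "1.3.0"}
    for name, (wanted, installed) in find_version_mismatches(EXPECTED).items():
        print(f"{name}: expected {wanted}, found {installed}")
```

Running this both locally and in the Databricks job would show whether the cluster's preinstalled version is the one being imported.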

Your Environment

  • dbx version used: 0.8.18
  • Databricks Runtime version: 12.2 LTS ML GPU
