
cluster_pairwise_predictions_at_threshold is crashing on Databricks in v4.0.11 #2835

@Brice543

What happens?

Since upgrading to the new version, linker.clustering.cluster_pairwise_predictions_at_threshold() crashes at the end of the process because of missing tables.
The error says that the "filtered_neighbours" tables do not exist in Databricks.

I checked, and it seems to be because the drop command is executed twice in connected_component.py. It should probably use "DROP TABLE IF EXISTS" to guard against this.
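
To illustrate the suspected failure mode, here is a minimal sketch of what happens when the same table is dropped twice. It is not Splink's code: sqlite3 stands in for the Databricks SQL engine so the snippet runs anywhere, the helper names `drop_strict` and `drop_safe` are hypothetical, and the column names are made up. Only the table name comes from the error message.

```python
import sqlite3

# Assumption: sqlite3 stands in for the Databricks SQL engine so this runs locally.
# The table name is taken from the error message; the columns are invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE filtered_neighbours (node_id INTEGER, neighbour INTEGER)")


def drop_strict(conn, table):
    """Plain DROP TABLE: raises if the table is already gone."""
    conn.execute(f"DROP TABLE {table}")


def drop_safe(conn, table):
    """DROP TABLE IF EXISTS: does nothing if the table is already gone."""
    conn.execute(f"DROP TABLE IF EXISTS {table}")


drop_strict(conn, "filtered_neighbours")  # first drop succeeds

try:
    drop_strict(conn, "filtered_neighbours")  # second drop hits a missing table
    second_strict_drop_failed = False
except sqlite3.OperationalError:
    second_strict_drop_failed = True

drop_safe(conn, "filtered_neighbours")  # safe to call any number of times
```

If the drop really is issued twice, the second plain `DROP TABLE` is the call that fails, while the `IF EXISTS` form can be repeated without error.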


To Reproduce

  1. Install the latest version of the library on Databricks.
  2. Run a standard pipeline up to the clustering step.
  3. Call clustering.cluster_pairwise_predictions_at_threshold() ==> it crashes because of the missing tables.

OS:

Databricks DBR

Splink version:

4.0.11

Have you tried this on the latest master branch?

  • I agree

Have you tried the steps to reproduce? Do they include all relevant data and configuration? Does the issue you report still appear there?

  • I agree

Labels: bug
