[Bug] Server crashes when I select the incorrect set of primary keys for Value diff #617

Open
@LePeti

Current Behavior

I selected a non-unique column (date_year) as the primary key in value diff, and the server crashed.

logs:

Future exception was never retrieved
future: <Future finished exception=RecceException('Invalid primary key: date_year. The column should be unique. Please check by this sql: \'\n\nselect\n    date_year as unique_field,\n    count(*) as n_records\n\nfrom "live"."sponsored_collections"."partner_revenue_status_changes_yearly"\nwhere date_year is not null\ngroup by date_year\nhaving count(*) > 1\n\n\'')>
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/usr/local/lib/python3.10/site-packages/recce/apis/run_func.py", line 139, in fn
    raise e
  File "/usr/local/lib/python3.10/site-packages/recce/apis/run_func.py", line 132, in fn
    result = task.execute()
  File "/usr/local/lib/python3.10/site-packages/recce/tasks/valuediff.py", line 230, in execute
    self._verify_primary_key(dbt_adapter, primary_key, model)
  File "/usr/local/lib/python3.10/site-packages/recce/tasks/valuediff.py", line 71, in _verify_primary_key
    raise RecceException(
recce.exceptions.RecceException: Invalid primary key: date_year. The column should be unique. Please check by this sql: '

select
    date_year as unique_field,
    count(*) as n_records

from "live"."sponsored_collections"."partner_revenue_status_changes_yearly"
where date_year is not null
group by date_year
having count(*) > 1

'
./run-scripts/start-recce.sh: line 40: 12986 Killed                  recce server --host "$RECCE_SERVER_HOST" --port "$RECCE_SERVER_PORT"
make: *** [makefile:95: start-recce] Error 137
/usr/local/lib/python3.10/multiprocessing/resource_tracker.py:224: UserWarning: resource_tracker: There appear to be 4 leaked semaphore objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '

Expected Behavior

I'd expect an error message in the UI instead of a crash, followed by the ability to select the primary keys again. Ideally the message would also show a sample of the duplicate values and a query I can use to double-check.
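
A minimal sketch of what I mean, not Recce's actual code. The names `InvalidPrimaryKey`, `verify_primary_key`, and `run_value_diff` are hypothetical, and an in-memory SQLite table stands in for the Redshift model. The check reuses the uniqueness query from the log, and the failure is returned as a result instead of being raised out of the worker:

```python
import sqlite3


class InvalidPrimaryKey(Exception):
    """Hypothetical exception that carries the check query and a duplicate sample."""

    def __init__(self, column, check_sql, sample):
        super().__init__(
            f"Invalid primary key: {column}. The column should be unique."
        )
        self.check_sql = check_sql
        self.sample = sample


def verify_primary_key(conn, table, column, sample_size=5):
    """Raise InvalidPrimaryKey if `column` has duplicate non-null values."""
    # Identifiers are interpolated directly for brevity; real code should
    # quote or validate them.
    check_sql = (
        f"select {column} as unique_field, count(*) as n_records "
        f"from {table} where {column} is not null "
        f"group by {column} having count(*) > 1"
    )
    sample = conn.execute(
        f"{check_sql} order by n_records desc, unique_field limit ?",
        (sample_size,),
    ).fetchall()
    if sample:
        raise InvalidPrimaryKey(column, check_sql, sample)


def run_value_diff(conn, table, primary_key):
    """Return a result dict; a bad key becomes an error payload, not a crash."""
    try:
        verify_primary_key(conn, table, primary_key)
    except InvalidPrimaryKey as exc:
        return {
            "status": "failed",
            "error": str(exc),
            "check_sql": exc.check_sql,
            "duplicate_sample": exc.sample,
        }
    # The actual diff would run here; omitted in this sketch.
    return {"status": "finished"}


# Stand-in data: date_year is duplicated, id is unique.
conn = sqlite3.connect(":memory:")
conn.execute("create table changes (id integer, date_year integer)")
conn.executemany(
    "insert into changes values (?, ?)", [(1, 2023), (2, 2023), (3, 2024)]
)

bad = run_value_diff(conn, "changes", "date_year")
good = run_value_diff(conn, "changes", "id")
print(bad["status"], bad["duplicate_sample"])  # failed [(2023, 2)]
print(good["status"])  # finished
```

With a payload like this, the UI could show the error, the duplicate sample, and the query, then let me pick different keys.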

Steps To Reproduce

  1. Started recce locally (v0.54) and opened the UI
  2. On the lineage view, the changed model was already selected
  3. Navigated to Explore Change > Value Diff
  4. Selected the wrong primary keys
  5. The server crashed

Relevant log output

Environment

  • recce: 0.54
  • OS: macOS 15.3.1
  • Python: 3.10.16
  • Data Warehouse: AWS Redshift
  • dbt: 1.8.3

Additional Context

No response

Metadata

Labels

bug (Something isn't working), triage (Triage required)
