[Bug] TABLE_NOT_FOUND {{tmp_relation}} when there are zero batches to process in incremental model #656

Closed as not planned
@antonysouthworth-halter

Description

Is this a new bug in dbt-athena?

  • I believe this is a new bug in dbt-athena
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

https://github.com/dbt-athena/dbt-athena/blob/34633a7f5679344852d1004991fc814ab385dadb/dbt/include/athena/macros/materializations/models/table/create_table_as.sql#L182

If there are zero batches to process, this results in a query error (TABLE_NOT_FOUND) later on, here:

https://github.com/dbt-athena/dbt-athena/blob/34633a7f5679344852d1004991fc814ab385dadb/dbt/include/athena/macros/materializations/models/helpers/get_partition_batches.sql#L11

because the relation does not exist.
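To make the failure mode concrete, here is a sketch of the zero-batch case (my reconstruction, not the macros' literal output; the table and partition column names are made up):

```sql
-- With force_batch=True, create_table_as only issues the CTAS for the
-- first batch. With zero batches, the tmp relation is never created:
--
--   CREATE TABLE "my_db"."my_model__dbt_tmp" WITH (...) AS SELECT ...  -- never runs
--
-- get_partition_batches later queries that relation, roughly:
SELECT DISTINCT partition_col
FROM "my_db"."my_model__dbt_tmp";
-- fails with TABLE_NOT_FOUND, since no batch ever created the table
```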

Expected Behavior

I would expect the model to complete and just load no data.

Steps To Reproduce

You will need:

  • an incremental model, partitioned, with more than 100 partitions, force_batch=True, strategy=append
  • conditions that result in zero data being loaded

e.g. in our case, our model has a clause like

```sql
select ...
from ...
where ...
{% if is_incremental() %}
-- don't load data that's already been loaded
and timestamp > (select max(timestamp) from {{ this }})

-- don't load data from hours that have not completed, allowing for 30 minutes of lateness
and timestamp < date_trunc('hour', now() - interval '30' minute)
{% endif %}
```

So basically, once the model has loaded the data for the last completed hour, it will never load data again until at least 30 minutes past the next hour.

Pretty edge-casey, and we could probably change to insert_overwrite with some effort, but at the same time I don't think the adapter should error out here, because there are genuine cases where there might be zero rows to load. For example, if you are running dbt as part of some other workload and need to retry the whole thing. Basically, it should be idempotent is what I'm saying haha.

Environment

- OS: Darwin 22.6 but we observe the same under Debian running on Fargate in AWS.
- Python: 3.9.19
- dbt: 1.8.0rc2
- dbt-athena-community: 1.8.0rc1

Additional Context

TBH I can probably implement the fix myself; I just wanted to ask if there's a specific reason we CTAS on the first batch here, rather than CREATE TABLE AS SELECT ... WITH NO DATA before the for loop and then just INSERT all batches?
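For illustration, a sketch of what I mean (hedged pseudologic, not a tested patch; the `batches` loop variable, `batch.filter`, and the elided table properties are assumptions standing in for whatever the macros actually use):

```sql
{# Create the relation up front so it always exists, even with zero batches. #}
{# Trino/Athena supports CTAS ... WITH NO DATA, which copies the schema only: #}
CREATE TABLE {{ tmp_relation }}
WITH ( ... )  {# same table properties the current CTAS passes #}
AS {{ sql }}
WITH NO DATA;

{# Then every batch, including the first, is a plain INSERT: #}
{% for batch in batches %}
INSERT INTO {{ tmp_relation }}
SELECT * FROM ({{ sql }}) WHERE {{ batch.filter }};
{% endfor %}
```

With zero batches the loop is simply skipped, the empty table still exists, and the later SELECT in get_partition_batches returns zero rows instead of TABLE_NOT_FOUND.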

Labels: type:bug (Something isn't working)