
DELTA_STATE_RECOVER_ERROR in Databricks reading delta tables written by python deltalake #3743

@mudravrik

Description

Hey folks!

First of all, this issue may have nothing to do with deltalake, but we have struggled to get any additional information about it, so we thought it worth asking in the deltalake GitHub as well. Any thoughts are appreciated!

Environment

  • dlt==1.11.0 with deltalake==0.25.1 OR
  • dlt==1.14.1 with deltalake==1.1.0
  • Cloud Storage: AWS S3
  • Databricks Runtime: 16.4 (Spark 3.5.2, Scala 2.13)
  • Write Pattern: All commits are overwrite mode. Occasional schema changes occur as part of the overwrite.
  • Delta protocol written by dlt: {"protocol":{"minReaderVersion":1,"minWriterVersion":2}}
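For reference, the protocol line above is what the deltalake Python API reports for these tables. A minimal inspection sketch, assuming a placeholder S3 URI and region (not our actual bucket):

```python
from deltalake import DeltaTable

# Placeholder URI/credentials; any of the affected tables written by dlt/deltalake.
dt = DeltaTable(
    "s3://example-bucket/example_table",
    storage_options={"AWS_REGION": "us-east-1"},
)

print(dt.protocol())  # expected: min_reader_version=1, min_writer_version=2
print(dt.version())   # latest committed table version
```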

Bug

Summary

We are experiencing intermittent read failures on Delta tables located in S3. These tables are written exclusively by the dlt library with deltalake (delta-rs). Databricks occasionally fails to read a table with a DELTA_STATE_RECOVER_ERROR, even though the table's transaction log (_delta_log) appears valid and is readable by other query engines. The issue is transient and seems to resolve itself over time without any specific user intervention.

Behavior Observed

When querying an affected Delta table, Databricks intermittently fails with a DELTA_STATE_RECOVER_ERROR; the table version reported in the error message varies depending on the actual table version.

The error occurs on various read operations, including:

  • SELECT * FROM table_name
  • DESCRIBE DETAIL table_name

The failures have been observed both in interactive notebooks and in API-driven jobs, e.g. from dbt-databricks.

Steps to Reproduce

The issue is spontaneous and we have not found a deterministic way to reproduce it. However, we have managed to trigger it intentionally several times under the following conditions (a rough sketch of the write loop follows the list):

  • A process uses the Python dlt library with the deltalake destination to write to a Delta table in S3 in overwrite mode.
  • This process runs multiple times, creating new versions of the table, with read queries run from Databricks after every write.
  • After a number of commits (empirically observed to be between 10 and 12), Databricks may begin failing to read the table.
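A minimal sketch approximating this write loop outside of dlt, using deltalake directly; the bucket, schema, iteration count, and region are illustrative placeholders, not our exact pipeline:

```python
import pyarrow as pa
from deltalake import write_deltalake

table_uri = "s3://example-bucket/example_table"   # placeholder location
storage_options = {"AWS_REGION": "us-east-1"}     # placeholder credentials/region

for i in range(12):  # reads began failing after roughly 10-12 commits
    batch = pa.table({"id": list(range(100)), "run": [i] * 100})

    # Occasionally change the schema as part of the overwrite, as the dlt pipeline does.
    if i % 5 == 0:
        batch = batch.append_column("extra", pa.array([None] * 100, pa.string()))

    write_deltalake(
        table_uri,
        batch,
        mode="overwrite",
        schema_mode="overwrite",   # let the overwrite replace the table schema
        storage_options=storage_options,
    )
    # A read query (e.g. SELECT * FROM table_name) is issued from Databricks after each write.
```

In the real setup the writes are produced by dlt's deltalake destination; the sketch only mirrors the overwrite-with-occasional-schema-change pattern described above.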

Additional details

  • _delta_log appears valid: Despite the error message suggesting deleted files, the _delta_log directory appears complete. We can successfully read the same "corrupted" table and its history using other tools such as DuckDB (see the cross-check sketch after this list), which parse the transaction log without issue.
  • Table copying "fixes" the issue: If we perform an S3-to-S3 copy of the entire table directory (data and _delta_log) to a new location, the new copy of the table is immediately readable in Databricks without any errors. This strongly suggests the underlying files are not corrupt.
  • Cluster restart "fixes" the issue: If we restart the cluster attached to the notebook producing the error, the table becomes readable again from that cluster and notebook.
  • Spontaneous resolution: The error is transient. If we take no action, the table becomes readable again on the same cluster after a variable period, ranging from 5 minutes to a few hours.
  • Minor protocol deviations: We noticed that some transaction log files contain multiple metaData actions within a single commit JSON. While this is atypical, it is valid according to the Delta protocol specification and does not cause issues for other readers.
  • Previous versions of the table are still readable: While the error occurs, the same cluster and notebook can usually still read an earlier historical version of the same table.
  • Not all tables are affected: We have multiple (~20) tables written from the same setup, but only 2 of them have been affected so far.
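For completeness, this is roughly how we cross-check an affected table with non-Spark readers while Databricks is failing. A sketch assuming a placeholder URI and pre-configured S3 credentials (e.g. a DuckDB secret), which are omitted here:

```python
import duckdb
from deltalake import DeltaTable

table_uri = "s3://example-bucket/example_table"  # placeholder

# deltalake (delta-rs) reads the transaction log and history of the "corrupted" table fine.
dt = DeltaTable(table_uri, storage_options={"AWS_REGION": "us-east-1"})
print(dt.version())
for commit in dt.history():
    print(commit)

# DuckDB's delta extension also scans the same table without issue.
con = duckdb.connect()
con.execute("INSTALL delta")
con.execute("LOAD delta")
# S3 credentials for DuckDB (e.g. CREATE SECRET ...) are assumed to be configured already.
print(con.sql(f"SELECT count(*) FROM delta_scan('{table_uri}')").fetchall())
```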
