[BUG]: Tempo drops the interpolated column in Databricks DBR 14 #417

@srggrs

Is there an existing issue for this?

  • I have searched the existing issues

Current Behavior

On a Databricks DBR 13.3 cluster, this code returns the interpolated column with no problem. On DBR 14, the same code drops the interpolated column from the result.

from tempo import TSDF

# `spark` is the SparkSession provided by the Databricks notebook
# this table has id, day, ts (timestamp) and signal columns
input_table = spark.table("my_input_table")

assert input_table.columns == ["id", "day", "ts", "signal"], "cols are not the same"

transformed_data = TSDF(input_table, ts_col="ts", partition_cols=["id", "day"])

interpolated = (
    transformed_data.resample(freq="5 minutes", func="mean")
    .interpolate(method="linear")
    .df
)

assert interpolated.columns == ["id", "day", "ts", "signal"], "cols are not the same"

Expected Behavior

After upgrading to DBR 14, I would expect no columns to be dropped: the interpolated dataframe should have the same columns as the input one.

Steps To Reproduce

  1. Set up one compute cluster with DBR 13 LTS and one with DBR 14 LTS
  2. Have an input table with columns similar to those above, perhaps even with just one partition column
  3. Run the code above on both clusters to see the difference between the two environments
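
To make the difference between the two runtimes easier to report, a small helper along these lines could list exactly which columns went missing. This is only a sketch: `dropped_columns` is a hypothetical helper, not part of tempo, and the column lists below are illustrative stand-ins for `input_table.columns` and `interpolated.columns`, not captured output.

```python
def dropped_columns(expected, actual):
    """Return the columns in `expected` that are absent from `actual`, keeping order."""
    actual_set = set(actual)
    return [col for col in expected if col not in actual_set]


# Illustrative stand-ins; on a cluster these would be
# input_table.columns and interpolated.columns.
input_cols = ["id", "day", "ts", "signal"]
output_cols = ["id", "day", "ts"]  # hypothetical result, for demonstration only

print(dropped_columns(input_cols, output_cols))
```

Running the same check on both clusters and pasting the printed list would show at a glance whether only the interpolated column is affected.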

Cloud

AWS

Version

dbl-tempo==0.1.27

Relevant log output

No response
