Bug: tables written to Databricks are unusable if any columns contain only `NA_character_`

This is a weird one, and it took a while to track down the root cause!

Reproducible like this:

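``` r
library(sparklyr)
library(tibble)

# Connect via Databricks Connect (this method relies on the pysparklyr package)
sc <- spark_connect(method = "databricks_connect", serverless = TRUE, version = "16.1")

# A single column containing only a character NA
df <- tibble(x = NA_character_)
tbl_name <- "x.y.z"

# Copy the data to Spark, then write it out as a table
spark_df <- sparklyr::copy_to(sc, df, tbl_name)
spark_write_table(spark_df, tbl_name, "overwrite")
```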

Strangely, the column shows up in the Databricks GUI with type `void`:

![Image](https://github.com/user-attachments/assets/1214429d-d5e5-4c46-a54b-aa8bd9ee6ade)

But when I try to check the Sample Data I get an error, and I'm also unable to access the data through the normal PySpark methods:

![Image](https://github.com/user-attachments/assets/694d7ed8-b167-4169-88c0-138b22d18f43)

I don't think this issue manifests with other datatypes, e.g. plain old `NA` (logical).
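
For anyone less familiar with R's typed missing values, here is a minimal base-R sketch of the distinction I mean. It only shows that the two kinds of `NA` have different types on the R side; it does not explain why the character one ends up as `void` in Databricks. The variable names are just for illustration.

``` r
# Base R only; no Spark connection needed.
chr_na <- NA_character_  # missing value typed as character
lgl_na <- NA             # plain NA is logical

typeof(chr_na)  # "character"
typeof(lgl_na)  # "logical"

# The same distinction carries over to data frame columns
df_chr <- data.frame(x = NA_character_, stringsAsFactors = FALSE)
df_lgl <- data.frame(x = NA)

class(df_chr$x)  # "character"
class(df_lgl$x)  # "logical"
```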
