Reuse already materialized data frames #442

Open
@krlmlr

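We want to reuse df2 once it has been materialized, instead of recomputing it from df1 each time it appears in a downstream relation such as df3.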

In duckdb, we need a new AltrepDataFrameRelation that wraps an ALTREP data frame and forwards either to the parent relation (presumably while the data frame is not yet materialized) or to a relation that implements the data frame scan (once it is).
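
The forwarding idea can be pictured with a small base R model. This is only a sketch under assumptions, not the duckdb implementation: `new_altrep_df_relation()`, its `plan()` and `materialize()` members, and the plan strings are all hypothetical names made up for illustration.

```r
# Hypothetical model of the proposed forwarding logic.
# None of these names exist in duckdb or duckplyr.
new_altrep_df_relation <- function(parent_plan, compute) {
  state <- new.env(parent = emptyenv())
  state$data <- NULL  # filled in once the data frame is materialized

  list(
    # Materialize on demand and cache the result,
    # similar in spirit to what collect() triggers.
    materialize = function() {
      if (is.null(state$data)) state$data <- compute()
      state$data
    },
    # Forward to a scan of the materialized data if it exists,
    # otherwise to the parent relation.
    plan = function() {
      if (is.null(state$data)) {
        parent_plan
      } else {
        "r_dataframe_scan(<materialized df2>)"
      }
    }
  )
}

rel_df2 <- new_altrep_df_relation(
  parent_plan = "Projection [a as a, 2.0 as b] <- r_dataframe_scan(<df1>)",
  compute = function() data.frame(a = 1, b = 2)
)

plan_before <- rel_df2$plan()        # still the projection over df1
df2_data    <- rel_df2$materialize() # computes and caches df2
plan_after  <- rel_df2$plan()        # now a direct scan of df2
```

In this model, anything built on top of df2 (such as the filter in df3) would pick up the direct scan after materialization, which is the behavior the reprex below shows is currently missing.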

options(conflicts.policy = list(warn = FALSE))
library(duckplyr)
#> Loading required package: dplyr
#> ✔ Overwriting dplyr methods with duckplyr methods.
#> ℹ Turn off with `duckplyr::methods_restore()`.

df1 <- duck_tbl(a = 1)
df2 <- df1 |> mutate(b = 2)
df3 <- df2 |> filter(b == 2)

duckdb:::rel_from_altrep_df(df2)
#> DuckDB Relation: 
#> ---------------------
#> --- Relation Tree ---
#> ---------------------
#> Projection [a as a, 2.0 as b]
#>   r_dataframe_scan(0x115848e88)
#> 
#> ---------------------
#> -- Result Columns  --
#> ---------------------
#> - a (DOUBLE)
#> - b (DOUBLE)
duckdb:::rel_from_altrep_df(df3)
#> DuckDB Relation: 
#> ---------------------
#> --- Relation Tree ---
#> ---------------------
#> Filter [(b = 2.0)]
#>   Projection [a as a, 2.0 as b]
#>     r_dataframe_scan(0x115848e88)
#> 
#> ---------------------
#> -- Result Columns  --
#> ---------------------
#> - a (DOUBLE)
#> - b (DOUBLE)

collect(df2)
#> # A tibble: 1 × 2
#>       a     b
#>   <dbl> <dbl>
#> 1     1     2

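# Here, df2 is already materialized: we could use it directly and would not need to compute anything.
# The relation trees below are still the same as before collect(), though.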
duckdb:::rel_from_altrep_df(df2)
#> DuckDB Relation: 
#> ---------------------
#> --- Relation Tree ---
#> ---------------------
#> Projection [a as a, 2.0 as b]
#>   r_dataframe_scan(0x115848e88)
#> 
#> ---------------------
#> -- Result Columns  --
#> ---------------------
#> - a (DOUBLE)
#> - b (DOUBLE)
duckdb:::rel_from_altrep_df(df3)
#> DuckDB Relation: 
#> ---------------------
#> --- Relation Tree ---
#> ---------------------
#> Filter [(b = 2.0)]
#>   Projection [a as a, 2.0 as b]
#>     r_dataframe_scan(0x115848e88)
#> 
#> ---------------------
#> -- Result Columns  --
#> ---------------------
#> - a (DOUBLE)
#> - b (DOUBLE)

Created on 2025-01-04 with reprex v2.1.1

Labels: duckdb 🦆 (issues where work in the duckdb package is needed)