Excessive RAM usage for DBI::dbWriteTable() and dplyr::collect() #97

Open
@krlmlr

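Writing a 16GB data frame (size in R memory) with DBI::dbWriteTable() uses about 32GB of RAM, and reading the same data back with dplyr::collect() also momentarily reaches 32GB. In both directions the peak is roughly twice the size of the data, even with memory_limit set to 1GB. From #72 (comment), by @SimonCoulombe.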

library(DBI)
library(dplyr)
library(dbplyr)
library(duckdb)

duckdb_path <- "/devroot/sandbox/tmp/duckdb.duckdb"

con <- dbConnect(duckdb::duckdb(dbdir = duckdb_path))
dbExecute(con, "PRAGMA threads=1; PRAGMA memory_limit='1GB';")

# Run this once to create the DuckDB file, then restart the session:
if (FALSE) {
  bigdata <- data.table::rbindlist(rlang::rep_along(1:3e6, list(iris)))
  dim(bigdata) # 450M rows, 5 columns
  lobstr::obj_size(bigdata) # 16.20 GB in RAM

  dbWriteTable(con, "straight_from_memory", bigdata)
}


bigdata <- tbl(con, "straight_from_memory") %>% collect()
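
One way to narrow this down might be to check whether the peak scales with the size of a single call. The sketch below writes and reads a table in slices instead of in one go. It is only an illustration under assumptions: the table name `chunked`, the `chunk_size` value, and the tiny stand-in data frame are hypothetical, and it has not been run against the 16GB table, so it says nothing yet about whether slicing actually lowers peak RAM.

```r
library(DBI)

# In-memory database and a small stand-in for the 16GB table (illustrative only).
con <- dbConnect(duckdb::duckdb())
small <- data.frame(id = seq_len(1500), value = sqrt(seq_len(1500)))
chunk_size <- 400L  # hypothetical; would be tuned to the available RAM

# Write in slices: the first call creates the table, later calls append to it.
for (start in seq(1L, nrow(small), by = chunk_size)) {
  rows <- start:min(start + chunk_size - 1L, nrow(small))
  dbWriteTable(con, "chunked", small[rows, ], append = start > 1L)
}

# Read back in slices. ORDER BY keeps the slices deterministic,
# at the cost of a sort for every query.
n_total <- dbGetQuery(con, "SELECT count(*) AS n FROM chunked")$n
n_read <- 0
for (offset in seq(0, n_total - 1, by = chunk_size)) {
  sql <- sprintf(
    "SELECT * FROM chunked ORDER BY id LIMIT %d OFFSET %.0f",
    chunk_size, offset
  )
  part <- dbGetQuery(con, sql)
  n_read <- n_read + nrow(part)  # a real script would process `part` here
}

dbDisconnect(con, shutdown = TRUE)
```

If the peak stays near twice the full table size even with small slices, the extra copy is probably not tied to the size of the individual data frame being transferred; if it drops with `chunk_size`, the duplication likely happens per call.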
