Hi Mark, I was wondering if you have an explanation for why compressing this particular data.frame with fst produces a larger file than the native (rda) format. Each column has the minimum possible entropy (a single repeated value). Do you think there is room for improvement in such cases?
require(fst) # CRAN version
df <- data.frame(
  x = rep(1, 1e8),
  y = rep(2, 1e8),
  z = rep(3, 1e8)
)
fst <- tempfile()
rda <- tempfile()
write.fst(df, fst, compress=100) # 2s
save(list="df", file=rda) # 22s
file.info(fst)$size/1024 # 5102.4 KB
file.info(rda)$size/1024 # 3410.6 KB
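To back up the entropy remark, here is a rough check, assuming Shannon entropy over the empirical distribution of values in a column. The helper `shannon_entropy` is hypothetical (my own, not part of fst), and it uses a shorter vector to keep it quick:

```r
# Shannon entropy (in bits) of the empirical value distribution of a vector.
# Hypothetical helper for illustration only; not part of the fst package.
shannon_entropy <- function(v) {
  p <- table(v) / length(v)  # relative frequency of each distinct value
  -sum(p * log2(p))
}

shannon_entropy(rep(1, 1e5))  # 0 for a constant column
```

This only measures the distribution of values, not how any particular compressor handles them, so it says nothing about the file sizes above.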
Thank you.