compression size on highly redundant data #96

Open
@arunsrinivasan

Description

Hi Mark, I was wondering if you have an explanation for why fst's compression of this particular data.frame ends up larger than the native (rda) format. The entropy of each column is the minimum it could be... Do you think there's room for improvement in such cases?

require(fst) # CRAN version
df <- data.frame(
        x=rep(1, 1e8),
        y=rep(2, 1e8),
        z=rep(3, 1e8)
      )

fst <- tempfile()
rda <- tempfile()

write.fst(df, fst, compress=100) # 2s
save(list="df", file=rda)        # 22s

file.info(fst)$size/1024 # 5102.4 KB
file.info(rda)$size/1024 # 3410.6 KB
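
For scale, a rough back-of-the-envelope on the ratios (the memCompress check at the end is just an illustration of how compressible a single constant column is with a generic codec, not part of the benchmark above):

raw_kb <- 3 * 1e8 * 8 / 1024     # ~2,343,750 KB of doubles in memory
raw_kb / 5102.4                  # fst ratio, roughly 459x
raw_kb / 3410.6                  # rda ratio, roughly 687x

# gzip over the serialized bytes of one constant column gives a rough
# sense of how small generic entropy coding can get this data:
gz <- memCompress(serialize(df$x, NULL), type = "gzip")
length(gz) / 1024                # KB for one column under plain gzip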

Thank you.
