Add a size statistics command #237

Merged: sampsyo merged 6 commits into main from size-stats on Nov 24, 2025

@sampsyo (Collaborator) commented Nov 23, 2025

This cherry-picks the command that @johnpalsberg added in #214 to print out the sizes of each part of a FlatGFA data structure. It is cleanly separable from the rest of that PR, and it would be nice to land immediately so we can do proper experiments in measuring file sizes when we make changes (as is the main focus in #214).

I also changed a few things around to make the code simpler, and I folded this functionality into the `toc` command, which does almost the same thing. You can now run `fgfa toc -b` to see the table of contents in byte sizes instead of element counts.

Here's an example:

$ fgfa -I tests/LPA.gfa toc -b
header: 8
segs: 90024
paths: 312
links: 83120
steps: 811224
seq_data: 206263
overlaps: 0
alignment: 20780
name_data: 319
optional_data: 56523
line_order: 8960
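
Since the output is plain `name: size` lines, it should be easy to consume from a script. Here is a minimal Python sketch, assuming the format stays exactly as in the example above; the function name `parse_toc_bytes` and the `SAMPLE` constant are hypothetical and not part of this PR. It parses the listing and prints each section's share of the summed sizes:

```python
def parse_toc_bytes(text: str) -> dict[str, int]:
    """Parse `name: size` lines into a dict, keeping the original order."""
    sizes: dict[str, int] = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and an echoed shell prompt line, if present.
        if not line or line.startswith("$"):
            continue
        name, _, value = line.partition(":")
        sizes[name.strip()] = int(value)
    return sizes


# Output copied from the example above (tests/LPA.gfa).
SAMPLE = """\
header: 8
segs: 90024
paths: 312
links: 83120
steps: 811224
seq_data: 206263
overlaps: 0
alignment: 20780
name_data: 319
optional_data: 56523
line_order: 8960
"""

sizes = parse_toc_bytes(SAMPLE)
total = sum(sizes.values())

# Largest sections first, with each one's fraction of the summed sizes.
for name, size in sorted(sizes.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:>14} {size:>8} {size / total:6.1%}")
print(f"{'total':>14} {total:>8}")
```

In practice the text would come from the command's stdout (for example via `subprocess.run`) rather than a pasted string. The total here is just the sum of the listed sections, which may or may not match the on-disk file size exactly.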

johnpalsberg and others added 6 commits November 23, 2025 14:34
Use the simpler name `fgfa size`, and adjust internal names to match.
This should go to stdout, not stderr. We also don't need to repeat the
word "bytes"; this should make the output easier to parse in scripts.
Makes it super simple to get the size (in bytes) of each pool.
This seems nice because they do *almost* the same thing? And it will be
more obvious if these two chunks of code get out of sync...
@sampsyo merged commit 2373bd5 into main on Nov 24, 2025
14 checks passed
@sampsyo deleted the size-stats branch on November 24, 2025 18:37