generate_compressor_model.py generates invalid .h files #27

@hansmaad

I'm using a fairly large input file (7 MB) containing only ASCII characters. The words look like this:

hninetrakierf hnisnoitknuf hrettuf hcsnedobSuf htsrUf g hbeilnetrag htsag...

With default options, the tool generates a model file in which each row of successor_ids_by_chr_id_and_chr_id ends with None, which is not valid C:

static const int8_t successor_ids_by_chr_id_and_chr_id[32][32] = {
  {-1, 9, 3, 10, 14, -1, 4, 1, 6, 5, -1, 0, ... -1, -1, -1, -1, -1, None},
  {0, -1, 1, 7, 4, 6, 2, 3, 5, 8, 12, 10, 1..., -1, -1, -1, -1, -1, None},

If you don't have time to fix this, do you have any idea how I could repair this file?
If I replace None with -1, the compression is slightly better than with the default shoco_model.h file, but maybe I could do better?
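
For anyone hitting the same problem, here is a minimal sketch of the manual workaround described above: replacing every bare None token in the generated header text with -1. It is not part of generate_compressor_model.py; the function name replace_none_entries and the shortened sample rows are made up for illustration. It only makes the header compile and does not address why the generator emits None in the first place.

```python
import re


def replace_none_entries(header_text: str, filler: str = "-1") -> str:
    """Replace bare `None` tokens, which are not valid C, with a filler value.

    Hypothetical helper for illustration only. The word boundaries keep
    identifiers that merely contain "None" untouched.
    """
    return re.sub(r"\bNone\b", filler, header_text)


# Shortened, made-up rows in the same shape as the generated table.
sample = (
    "  {-1, 9, 3, -1, None},\n"
    "  {0, -1, 1, -1, None},\n"
)

repaired = replace_none_entries(sample)
print(repaired)
```

To apply it to a real file, read the generated header into a string, pass it through the function, and write the result back. Whether -1 is the best filler for compression quality is exactly the open question here.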
