
Roadmap for v1.6.0 #145

Open
@kspalaiologos

Description

This ticket is meant to be a collective TODO list for the v1.6.0 release with all the major features that I am planning.

  • Start supporting machine-specific and OS-specific code. Copy over some code from xpar, particularly the NASM detection in the m4 files, the CRC32C implementations, and the CPU feature detection. We could gate it behind the same architecture-specific flag options as xpar. Unfortunately, xpar is currently packaged only for NixOS, and the NixOS specifics make it impossible for machine-specific optimisations to ever execute there. As such, it is actually more portable(!) to add assembly source units to the program, not to mention the problems with compilers miscompiling the hot loops.
    • Add hardware CRC32 support.
    • Add some code to detect the number of available processors to use with -j 0. We want a C analogue of std::thread::hardware_concurrency(). Maybe determine the number of CPUs by task affinity (sched_getaffinity, Linux-specific), sysconf with _SC_NPROCESSORS_ONLN (a widely implemented extension), get_nprocs (GNU), or maybe by reading /proc/cpuinfo... Another possibility is pthread_getaffinity_np, or sched_getaffinity_np on NetBSD 5+; on Windows we would want GetProcessAffinityMask. In practice, sysconf appears to work on glibc, Mac OS X 10.5, FreeBSD, AIX, OSF/1, Solaris, Cygwin, and Haiku. HP-UX would require pstat_getdynamic and IRIX uses sysmp; as a fallback on Windows platforms we could use GetSystemInfo. Possible m4 code of interest when it comes to making heads or tails of this mess.
    • Improved work-stealing concurrency for parallel encoding and decoding in the CLI tool. Currently we read the blocks in one go, then encode them in one go, and then write them back, each as a single separate stage. It would be an improvement to perform the encoding and decoding in parallel with I/O operations to get more mileage out of parallelism on slow disks.
    • Preserve the ownership, permissions, and atime/ctime/mtime metadata of input files on the output files. I am skeptical because I don't want to spend a lot of time catering to portability and making sure that we also respect Windows ACLs, for example. I think that doing so requires elevated privileges in certain cases, too.
    • Memory mapped I/O for faster operation in the CLI stub.
  • Stuff that requires a new format:
    • Undo the arithmetic coding stage if it doesn't yield satisfactory results. This way we store the data verbatim and, at some cost in encoding speed, drastically improve decode performance on incompressible segments.
    • Add a special "end of file" block marker that preserves data integrity on truncated streams.
    • Unify frame/CLI tool formats.
  • Miscellany
    • Document the current file format better, getting us a few steps closer to a third party being able to produce a valid encoder or decoder independently of the current source code.
    • Investigate libcubwt.
    • Clean up the code.
  • Finished tasks:
    • Update the soname for ABI-breaking versions (DONE IN COMMIT f3b4730).
    • Rework the code to use yarg instead of the getopt_long shim (DONE IN COMMIT 249b173).
      • Potentially handle OOMs in yarg within the CLI tool.
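
For the -j 0 item above, a minimal sketch of what a C analogue of std::thread::hardware_concurrency() might look like. The function name detect_cpu_count is hypothetical and not part of the current code base. It tries sched_getaffinity on Linux, then sysconf(_SC_NPROCESSORS_ONLN), and uses GetSystemInfo on Windows. The HP-UX, IRIX and NetBSD paths mentioned in the list are left out for brevity, and GetProcessAffinityMask would probably be the better first choice on Windows.

```c
#if defined(__linux__) && !defined(_GNU_SOURCE)
#define _GNU_SOURCE /* exposes sched_getaffinity and CPU_COUNT on glibc/musl */
#endif

#if defined(_WIN32)
#include <windows.h>
#else
#include <unistd.h>
#endif
#if defined(__linux__)
#include <sched.h>
#endif

/* Hypothetical helper: number of processors usable by this process.
 * Always returns at least 1, so it is safe to use as a thread count. */
static int detect_cpu_count(void) {
#if defined(_WIN32)
    /* Fallback path from the list above; ignores the process affinity mask. */
    SYSTEM_INFO info;
    GetSystemInfo(&info);
    return info.dwNumberOfProcessors > 0 ? (int)info.dwNumberOfProcessors : 1;
#else
#if defined(__linux__) && defined(CPU_COUNT)
    {
        /* Respects taskset/cgroup cpuset restrictions on Linux. */
        cpu_set_t set;
        if (sched_getaffinity(0, sizeof(set), &set) == 0) {
            int in_mask = CPU_COUNT(&set);
            if (in_mask > 0) return in_mask;
        }
    }
#endif
#if defined(_SC_NPROCESSORS_ONLN)
    {
        /* Number of processors currently online. */
        long online = sysconf(_SC_NPROCESSORS_ONLN);
        if (online > 0) return (int)online;
    }
#endif
    return 1; /* Unknown platform: assume a single processor. */
#endif
}
```

The CLI could then map -j 0 to detect_cpu_count() and keep any explicit -j N untouched.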
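
For the hardware CRC32 item, a rough sketch of how a hardware path could sit next to a portable one. It assumes the checksum in question is CRC32C (Castagnoli), which is what the SSE4.2 crc32 instruction computes. The function names are hypothetical. The dispatch here is compile-time only (via __SSE4_2__), whereas the plan above is to reuse runtime CPU feature detection from xpar, and a real implementation would process 8 bytes at a time rather than one.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__SSE4_2__)
#include <nmmintrin.h> /* _mm_crc32_u8 */
#endif

/* Portable bitwise CRC32C, reflected polynomial 0x82F63B78.
 * Pass crc = 0 to start; pass a previous result to continue a stream. */
static uint32_t crc32c_sw(uint32_t crc, const uint8_t *buf, size_t len) {
    crc = ~crc;
    for (size_t i = 0; i < len; i++) {
        crc ^= buf[i];
        for (int bit = 0; bit < 8; bit++) {
            /* Shift right; XOR in the polynomial if the dropped bit was 1. */
            crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
        }
    }
    return ~crc;
}

/* Uses the SSE4.2 instruction when the compiler targets it,
 * otherwise falls back to the bitwise version. */
static uint32_t crc32c(uint32_t crc, const uint8_t *buf, size_t len) {
#if defined(__SSE4_2__)
    crc = ~crc;
    for (size_t i = 0; i < len; i++)
        crc = _mm_crc32_u8(crc, buf[i]);
    return ~crc;
#else
    return crc32c_sw(crc, buf, len);
#endif
}
```

Both paths should agree on the standard check value: the CRC32C of the ASCII string "123456789" is 0xE3069283.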
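
The "undo the arithmetic coding stage" item could look roughly like the sketch below. Everything about the layout is an assumption made for illustration: the one-byte flag, the BLOCK_CODED/BLOCK_STORED values and the emit_block name are hypothetical and do not describe the current or the planned format. "Satisfactory" is taken here to mean simply "smaller than the raw data"; the real threshold is still to be decided.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical per-block flag values. */
enum { BLOCK_CODED = 0, BLOCK_STORED = 1 };

/* Writes a flag byte followed by either the entropy-coded payload or the
 * raw bytes, whichever is smaller. `out` must hold at least 1 + raw_len
 * bytes. Returns the number of bytes written. */
static size_t emit_block(uint8_t *out,
                         const uint8_t *raw, size_t raw_len,
                         const uint8_t *coded, size_t coded_len) {
    if (coded_len < raw_len) {
        out[0] = BLOCK_CODED;
        memcpy(out + 1, coded, coded_len);
        return 1 + coded_len;
    }
    /* Coding did not help: store verbatim so the decoder can just copy. */
    out[0] = BLOCK_STORED;
    memcpy(out + 1, raw, raw_len);
    return 1 + raw_len;
}
```

The encoder still pays for the coding attempt, which is the encoding-speed overhead mentioned above, while the decoder reduces to a memcpy for stored blocks.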
