This ticket is meant to be a collective TODO list for the v1.6.0 release with all the major features that I am planning.
- Start supporting machine-specific and OS-specific code. Copy over some code from xpar, particularly the NASM detection in m4 files, the CRC32C implementations and the CPU feature detection. We could gate it behind the same architecture-specific flag options as xpar. Unfortunately, xpar is currently packaged only for NixOS, and NixOS specifics make it impossible for machine-specific optimisations to ever execute there. As such, it is actually more portable(!) to add assembly source units to the program, not to mention that it sidesteps the problem of compilers miscompiling the hot loops.
- Add hardware CRC32 support.
- Add some code to detect the number of available processors to use with `-j 0`. We want a C analogue of `std::thread::hardware_concurrency()`. Maybe determine the number of CPUs by task affinity (`sched_getaffinity`, Linux-specific), `sysconf` (GNU-only), `get_nprocs` (also GNU), or maybe read `/proc/cpuinfo`... Another possibility is `pthread_getaffinity_np`, or NetBSD 5+ GNU's `sched_getaffinity_np`; on Windows we would want `GetProcessAffinityMask`. In practice, `sysconf` appears to work on glibc, Mac OS X 10.5, FreeBSD, AIX, OSF/1, Solaris, Cygwin, Haiku. HP-UX would require `pstat_getdynamic`, IRIX uses `sysmp`; as a fallback on Windows platforms we could use `GetSystemInfo`. Possible m4 code of interest when it comes to making heads or tails of this mess.
- Improved work-stealing concurrency for parallel encoding and decoding in the CLI tool. Currently we read the blocks in one go, then encode them in one go, then write them back again, each as a single whole stage. It would be an improvement to perform the encoding and decoding in parallel with I/O operations to get more mileage out of parallelism on slow disks.
- Preserve the ownership, permissions and atime/ctime/mtime metadata of input files in the output files. I am skeptical because I don't want to spend a lot of time catering to portability and making sure that we also respect Windows ACLs, for example. I think that doing so requires elevated privileges in certain cases too.
- Memory mapped I/O for faster operation in the CLI stub.
- Stuff that requires a new format:
  - Undo the arithmetic coding stage if it doesn't yield satisfactory results. This way we store the data verbatim and, at some cost in encoding speed, drastically improve decode performance on incompressible segments.
  - Add a special "end of file" block marker that preserves data integrity on truncated streams.
  - Unify frame/CLI tool formats.
- Miscellany
  - Document the current file format better, getting us a few steps closer to a 3rd party being able to produce a valid encoder or decoder independently of the current source code.
  - Investigate libcubwt.
  - Clean up the code.
- Finished tasks:
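
For the `-j 0` item above, here is a rough sketch of what the detection could look like, trying the affinity mask first and falling back to `sysconf`, with `GetProcessAffinityMask` and `GetSystemInfo` on Windows. The names `detect_cpu_count` and `resolve_job_count` are made up for illustration and are not in the code base. The HP-UX, IRIX and NetBSD paths are left out, and I have only reasoned about the Linux/glibc path, so treat the rest as an assumption.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* needed for sched_getaffinity and CPU_COUNT on glibc */
#endif
#include <stdio.h>
#if defined(_WIN32)
#include <windows.h>
#else
#include <unistd.h>
#if defined(__linux__)
#include <sched.h>
#endif
#endif

/* Hypothetical helper: best-effort number of CPUs usable by this process.
   Never returns less than 1. */
static int detect_cpu_count(void) {
#if defined(_WIN32)
    DWORD_PTR process_mask = 0, system_mask = 0;
    SYSTEM_INFO si;
    if (GetProcessAffinityMask(GetCurrentProcess(), &process_mask, &system_mask)
        && process_mask != 0) {
        int bits = 0;
        /* Count the set bits in the affinity mask. */
        for (; process_mask != 0; process_mask &= process_mask - 1) bits++;
        return bits;
    }
    /* Fallback: total processors reported by the system. */
    GetSystemInfo(&si);
    return si.dwNumberOfProcessors > 0 ? (int)si.dwNumberOfProcessors : 1;
#else
#if defined(__linux__) && defined(CPU_COUNT)
    {
        /* Respect task affinity (e.g. taskset, cpusets). */
        cpu_set_t set;
        if (sched_getaffinity(0, sizeof(set), &set) == 0) {
            int usable = CPU_COUNT(&set);
            if (usable > 0) return usable;
        }
    }
#endif
#if defined(_SC_NPROCESSORS_ONLN)
    {
        /* Number of processors currently online. */
        long online = sysconf(_SC_NPROCESSORS_ONLN);
        if (online > 0) return (int)online;
    }
#endif
    return 1; /* Unknown platform or every query failed. */
#endif
}

/* Hypothetical helper: map the value given to -j to a worker count,
   treating 0 (or anything non-positive) as "autodetect". */
static int resolve_job_count(int requested) {
    return requested > 0 ? requested : detect_cpu_count();
}
```

With this, the CLI would call `resolve_job_count` on the parsed `-j` argument. Checking affinity before `sysconf` means a process restricted to a subset of cores does not spawn more workers than it can actually run.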