Releases: Cyan4973/xxHash
xxHash v0.8.3
xxHash v0.8.3 is a maintenance update, featuring a bug fix and several quality of life improvements.
Bug Fix
- XXH3_128bits_withSecretandSeed()
 Corrects an edge case (#894) that could generate invalid results. Users of this function should upgrade. Thanks to @hltj for the report and fix.
Command-Line Improvements
- Runtime Vector Extension on x86
 xxhsum automatically detects and employs the best available vector extension (SSE, AVX, etc.) on x86/x64 CPUs. Previously, this required an explicit build macro; it is now enabled by default. Maintainers can still control it manually, and can disable it with DISPATCH=0.
- --filelist / --files-from
 Accepts file names from a text file or stdin, simplifying bulk hashing. Kudos to @Ian-Clowes for the idea and implementation.
- Short GNU Format for XXH3
 A new -H3 mode generates XXH3 64-bit hashes with an XXH3_ prefix, neatly distinguishing them from XXH64. The symlink xxh3sum defaults to this mode.
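To illustrate how the XXH3_ prefix disambiguates the two 64-bit algorithms, here is a minimal Python sketch that classifies a GNU-style checksum line. It assumes the usual "<hash>  <filename>" layout with a 16-hex-digit value. The function name classify_gnu_line and the sample hash values are made up for illustration, and the real xxhsum parser may accept more variants.

```python
import re

# Assumed GNU-style layout: "<hash>  <filename>", where an XXH3 hash
# carries an "XXH3_" prefix and a bare 16-hex-digit hash is XXH64.
_GNU_LINE = re.compile(r"^(XXH3_)?([0-9a-fA-F]{16})\s+(.+)$")

def classify_gnu_line(line):
    """Return (algorithm, hex_digest, filename), or None if unrecognized."""
    m = _GNU_LINE.match(line.rstrip("\n"))
    if m is None:
        return None
    prefix, digest, filename = m.groups()
    algorithm = "XXH3" if prefix else "XXH64"
    return algorithm, digest.lower(), filename

# The hash values below are placeholders, not real checksums.
print(classify_gnu_line("XXH3_0123456789abcdef  data.bin"))
print(classify_gnu_line("0123456789abcdef  data.bin"))
```

Without the prefix, both lines would carry an indistinguishable 64-bit value, which is the ambiguity the new format removes.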
Portability
- LoongArch SX SIMD Support
 Includes an optimized LoongArch SX implementation of xxh3 (courtesy of @lrzlin).
- Extended Platform Coverage
 Validated builds for AIX OS and SPARC CPUs.
New Contributors
- @scribam made their first contribution in #859
- @klemensn made their first contribution in #880
- @thestr4ng3r made their first contribution in #878
- @cclauss made their first contribution in #895
- @nathaniel-brough made their first contribution in #906
- @crosdahl made their first contribution in #932
- @pps83 made their first contribution in #931
- @hltj made their first contribution in #894
- @spaette made their first contribution in #953
- @mofosyne made their first contribution in #954
- @Ian-Clowes made their first contribution in #972
- @crrodriguez made their first contribution in #976
- @lrzlin made their first contribution in #981
Full Changelog: v0.8.2...v0.8.3
xxHash v0.8.2
xxHash v0.8.2 is an incremental update featuring multiple small improvements and fixes spread out over ~300 commits.
Faster performance
Several updates by @easyaspi314 and @hzhuang1 improve performance on Arm platforms, most notably the NEON code path. On the M1 Pro, this translates into +20% speed for xxh3 and xxh128 (from 30.0 GB/s to 36 GB/s).
Some of the changes are generic, so other platforms can be affected too, though typically to a lesser extent (~5%).
On wasm, the speed of xxh3 improves by a factor of 2 to 3 (depending on underlying hardware) through the use of simd128 (@easyaspi314). This is especially efficient under the v8 js engine, notably used by chrome and node.js.
Finally, @hzhuang1 added support for Arm's SVE vector extension. This is useful for server-side aarch64 cpus with hardware support for wide vectors, such as Fujitsu's A64FX.
Fixes and improvements
Notable fixes in this update include the resolution of issues with XXH3 S390x vector implementation, PowerPC vector compilation with IBM XL compiler, and -Og compilation.
Furthermore, the command line interface (CLI) was refined with features such as support for comment lines in check files and commands such as --binary and --ignore-missing (@t-mat). Additionally, issues with filenames containing the /LF character were resolved.
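As a rough illustration of what comment-line support and --ignore-missing mean for a checker, the Python sketch below walks a list of check-file lines. It assumes comments start with '#' and entries use the GNU-style "<hash>  <filename>" layout. The function iter_check_entries is hypothetical and does not reproduce xxhsum's actual parsing rules.

```python
import os

def iter_check_entries(lines, ignore_missing=False):
    """Yield (hex_digest, filename) pairs from GNU-style check lines.

    Assumptions for this sketch: comment lines start with '#', and each
    entry looks like "<hash>  <filename>" (two spaces as separator).
    """
    for raw in lines:
        line = raw.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # blank line or comment: nothing to verify
        digest, _, filename = line.partition("  ")
        if not filename:
            continue  # malformed line; a real tool would report it
        if ignore_missing and not os.path.exists(filename):
            continue  # in the spirit of --ignore-missing: skip absent files
        yield digest, filename
```

A caller would then hash each remaining file and compare the result against the listed digest.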
The build process was also refined, with improvements such as fixing pkgconfig generation with cmake (@ilya-fedin), icc compilation, cmake install directories, and new build options to reduce binary size (@easyaspi314). Dedicated install targets were introduced (@ffontaine), and support for DISPATCH mode in cmake was added (@hzhuang1).
In terms of portability, the update includes the SVE vector implementation of XXH3, compatibility with freestanding environments using XXH_NO_STDLIB, and the ability to build on Haiku. The code has also been validated on m68k and risc-v.
Documentation
XXH3 finally has a written specification, thanks to @adrien1018!
Source code can also be digested by doxygen to generate code documentation automatically. An instance is now available on the project homepage.
Erratum
There is a bug in this version when invoking the function XXH3_128bits_withSecretandSeed(). It occurs only when all of the following hold: the parameter seed == 0, the input length is < XXH3_MIDSIZE_MAX (< 240 bytes), the secret is different from the one created with XXH3_generateSecret_fromSeed(), and the user is invoking the streaming API. The hash values produced in this case are incorrect: as stated in the documentation, they should be == XXH3_128bits_withSeed(). This is fixed in later versions and in the dev branch, thanks to @hltj (a9b2f18).
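All the conditions in this erratum must hold together for the bug to trigger. The small Python predicate below is written purely as a reading aid and makes that conjunction explicit. The function and parameter names are hypothetical and not part of xxHash; the 240-byte threshold is the one quoted above.

```python
XXH3_MIDSIZE_MAX = 240  # threshold quoted in the erratum

def hits_v082_erratum(seed, input_len, secret_from_seed_generator, uses_streaming_api):
    """Return True only when every condition listed in the erratum holds."""
    return (
        seed == 0
        and input_len < XXH3_MIDSIZE_MAX
        and not secret_from_seed_generator  # i.e. a custom secret was supplied
        and uses_streaming_api
    )

# A seed of 0, a short input, a custom secret and the streaming API together:
print(hits_v082_erratum(0, 100, False, True))
# Changing any single condition takes the call out of the affected case:
print(hits_v082_erratum(1, 100, False, True))
```

Callers that never combine all four conditions are outside the scenario described here.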
Changelog
- fix  : XXH3 S390x vector implementation (@hzhuang1)
- fix : PowerPC vector compilation with IBM XL compiler (@MaxiBoether)
- perf : improved WASM speed by x2/x3 using SIMD128 (@easyaspi314)
- perf : improved speed (+20%) for XXH3 on ARM NEON (@easyaspi314)
- cli  : Fix filenames containing /LF character (@t-mat)
- cli  : Support # comment lines in --check files (@t-mat)
- cli  : Support commands --binary and --ignore-missing (@t-mat)
- build: fix -Og compilation (@easyaspi314, @t-mat)
- build: fix pkgconfig generation with cmake (@ilya-fedin)
- build: fix icc compilation
- build: fix cmake install directories
- build: new build options XXH_NO_XXH3, XXH_SIZE_OPT and XXH_NO_STREAM to reduce binary size (@easyaspi314)
- build: dedicated install targets (@ffontaine)
- build: support DISPATCH mode in cmake (@hzhuang1)
- portability: fix x86dispatch when building with Visual + clang-cl (@t-mat)
- portability: SVE vector implementation of XXH3 (@hzhuang1)
- portability: compatibility with freestanding environments, using XXH_NO_STDLIB
- portability: can build on Haiku (@Begasus)
- portability: validated on m68k and risc-v
- doc : XXH3 specification (@adrien1018)
- doc : improved doxygen documentation (@easyaspi314, @t-mat)
- misc : dedicated sanity test binary (@t-mat)
Full change list (github generated)
- Fix an assert comparison the same values (flagged by PVS Studio in 0.8.1) by @kcgen in #628
- Add GitHub Actions badge for release branch by @t-mat in #633
- Add windows-2022 to ci.yml by @t-mat in #634
- Add macOS matrix to ci.yml by @t-mat in #635
- Fix compilation on RHEL 7 ppc64le (gcc 4.8) by @ellert in #631
- Add clang-cl for MSVC 2019 to ci.yml by @t-mat in #637
- [NEON] Split XXH3 into 6 NEON lanes and 2 scalar lanes on aarch64 by @easyaspi314 in #632
- Fix some ARM/clang-cl feature detection issues by @easyaspi314 in #623
- Add QEMU/gcc matrix to ci.yml by @t-mat in #640
- fix #625 by @Cyan4973 in #638
- fix #627 by @Cyan4973 in #639
- added m68k emulation tests to GA by @Cyan4973 in #643
- Document some nerdy ARM stuff, move scalarRound down. by @easyaspi314 in #642
- fix minor static analyzer warning by @Cyan4973 in #644
- fix man page installation by @Cyan4973 in #648
- fix cmake --install by @Cyan4973 in #649
- Use __attribute__((aligned)) instead of packed by @Hello71 in #650
- [ARM/AArch64] Fix multiple GCC codegen problems by @easyaspi314 in #651
- removed XXH3 declarations when XXH_NO_XXH3 is defined by @Cyan4973 in #653
- new build macro XXH_NO_STDLIB by @Cyan4973 in #654
- improved nostdlib test by @Cyan4973 in #656
- added __attribute__((const)) by @Cyan4973 in #657
- added __attribute__((malloc)) by @Cyan4973 in #658
- added __attribute__((pure)) by @Cyan4973 in #659
- Documentation update by @easyaspi314 in #661
- Makefile: add dedicated install targets by @ffontaine in #665
- XXH_HAS_C_ATTRIBUTE(x)?! by @easyaspi314 in #662
- do no longer depend on <assert.h> for XXH_STATIC_ASSERT by @Cyan4973 in #670
- Properly fix altivec namespace collisions by @easyaspi314 in #672
- Introduce XXH_SIZE_OPT and XXH_NO_STREAM by @easyaspi314 in #667
- Remove duplicated definition of XXH3_128bits by @mterron in #676
- Removed windows-2016 from ci.yml by @t-mat in #690
- tipi.build instructions by @pysco68 in #688
- Fix issue #695 by @t-mat in #698
- Build fix for Haiku by @Begasus in #696
- Use inline assembler for Power/IBM XL Compiler by @MaxiBoether in #708
- test filename-escape by @Cyan4973 in #710
- avoid add_compile_definitions for cmake < v3.12 by @Cyan4973 in #711
- just more cmake v2.8.12 tests by @Cyan4973 in #721
- CPack Added in #719
- Remove stream loads and slightly improve avx512 seed generation by @goldsteinn in #726
- Fix: brace expansion by @t-mat in #729
- Fix issue #724 by @t-mat in #730
- Remove macOS-10.15 from ci.yml by @t-mat in #736
- blind fix for fallthrough on icc by @Cyan4973 in #718
- Optimize XXH3_accumulate_512_neon by @dougallj in #734
- Fix typos found by codespell by @DimitriPapadopoulos in #739
- ci: fix tipi build error on github CI workflow by @hzhuang1 in #749
- Update GitHub Actions by @DimitriPapadopoulos in #742
- xxhash: support SVE by intrinsic code by @hzhuang1 in #752
- fix issues reported by cppcheck by @hzhuang1 in #746
- CI: fix missing space by @hzhuang1 in #758
- Fixing tipi-build / Build as dependency CI step by @pysco68 in #760
- Customize full accumulating loop for SVE by @hzhuang1 in #756
- added macos-12 test to GH CI by @Cyan4973 in #765
- Small improvement to x86 vectorized hashes and medium-sizes hash. by @goldsteinn in #754
- dispatch: Use __attribute__((constructor)) on XXH_setDispatch by @goldsteinn in #773
- Fix typo found by codespell by @di...
v0.8.1
xxHash v0.8.1 is a general clean up of the code base, following the stabilization of xxh3 and xxh128 in v0.8.0.
There are a few welcomed evolutions and improvements, but for the most part, this release consists of fixes for multiple corner cases and scenarios, that shall improve usability of libxxhash and xxhsum across a wide range of platforms.
Stable API entry points have not changed: all entry points labelled "stable" will continue to work as intended in this release and future ones.
Improved performance
While the "big picture" is unchanged, there are a few notable improvements.
XXH3 / XXH128 feature a large speed improvement in streaming mode, which is particularly noticeable with gcc and MSVC (clang was already in good shape), by as much as +40%, making streaming speed essentially on par with single-shot mode when ingesting large quantities of data.
XXH64 and even XXH32 feature improved latency performance for small inputs of random sizes. Perhaps as importantly, their binary size is smaller.
New capabilities
There is a new experimental XXH3 variant, named _withSecretandSeed(). In a nutshell, it combines seed for small inputs, with secret for large inputs.
The main driver for this variant is a wish to skip the delay caused by the transparent generation of the secret when using the _withSeed() variant with large inputs, which results in a measurable performance drop for "not so large" sizes (< 1 KB) (note: this delay is negligible for "large" inputs, such as > 256 KB). Coupled with the new function XXH3_generateSecret_fromSeed(), which generates the same secret as the one generated internally when using the _withSeed() variant, it results in exactly the same return values, while skipping the secret generation stage, thus improving speed.
Experimental XXH3_generateSecret() has been extended to allow generation of secret of any size (though respecting the specification's minimum size). It's generally recommended to use this generator to ensure a source of "high entropy" for the secret.
On the CLI front, a highly demanded xxhsum feature was an ability to generate XXH3 checksum values. This is achieved in v0.8.1, using the --tag format, which ensures that XXH3 results cannot be confused with (default) XXH64 ones, even though they feature the same 64-bit width.
Detailed changelist
- perf : much improved performance for XXH3 streaming variants, notably on gcc and msvc
- perf : improved XXH64 speed and latency on small inputs
- perf : small XXH32 speed and latency improvement on small inputs of random size
- perf : minor stack usage improvement for XXH32 and XXH64
- api  : new experimental variants XXH3_*_withSecretandSeed()
- api  : updated XXH3_generateSecret(), can now generate secret of any size (>= XXH3_SECRET_SIZE_MIN)
- cli  : xxhsum can now generate and check XXH3 checksums, using command -H3
- build: can build xxhash without XXH3, with new build macro XXH_NO_XXH3
- build: fix xxh_x86dispatch build with MSVC, by @apankrat
- build: XXH_INLINE_ALL can always be used safely, even after XXH_NAMESPACE or a previous XXH_INLINE_ALL
- build: improved PPC64LE vector support, by @mpe
- install: fix pkgconfig, by @ellert
- install: compatibility with Haiku, by @Begasus
- doc : code comments made compatible with doxygen, by @easyaspi314
- misc : XXH_ACCEPT_NULL_INPUT_POINTER is no longer necessary, all functions can accept NULL input pointers, as long as size == 0
- misc : complete refactor of CI tests on Github Actions, offering much larger coverage, by @t-mat
- misc : xxhsum code base split into multiple specialized units, within directory cli/, by @easyaspi314
xxHash v0.8.0 - Stable XXH3
Stable XXH3
After more than a year in the making, XXH3 has finally reached stable status, for both its 64-bit and 128-bit variants.
While the code itself was in good enough shape for production use, the generated values could still change between versions. This limited XXH3 to local sessions only.
From now on, output values produced by XXH3 for a given input and parameter set will remain identical across systems and across future versions. It makes it possible to store these values for later comparison, or to exchange them across network connections.
BSD-style checksums
Official stabilization being the main goal of this release, there are only minimal additional changes.
A notable one though is the ability for xxhsum CLI to produce and check BSD-style checksum lines, using command --tag.
One advantage of --tag format is that it explicitly specifies the algorithm and format used to represent the checksum. For example, it explicitly mentions if a checksum value follows the canonical format (XXH32) or the alternative little-endian format (XXH32_LE).
Generating BSD-style checksum lines was actually already possible, but as the CLI was unable to --check them, it remained a hidden option.
This situation changes with v0.8.0, thanks to a patch by @WayneD which makes it possible to --check BSD-style checksum lines.
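To make the BSD-style layout concrete, here is a hedged Python sketch that parses one such line and normalizes a little-endian digest back to canonical order. It assumes the conventional "<ALGO> (<filename>) = <hex digest>" shape, and that the _LE form prints the same bytes in reverse order. The function parse_bsd_line and the sample digests are illustrative only and are not taken from xxhsum.

```python
import re

# Assumed BSD-style layout: "<ALGO> (<filename>) = <hex digest>".
_BSD_LINE = re.compile(r"^(\w+) \((.*)\) = ((?:[0-9a-fA-F]{2})+)$")

def parse_bsd_line(line):
    """Parse one BSD-style checksum line into a dict, or return None."""
    m = _BSD_LINE.match(line.rstrip("\n"))
    if m is None:
        return None
    tag, filename, digest = m.groups()
    little_endian = tag.endswith("_LE")
    algorithm = tag[:-3] if little_endian else tag
    # Assumption: reversing the bytes of an _LE digest recovers the
    # canonical (big-endian) representation.
    if little_endian:
        canonical = bytes.fromhex(digest)[::-1].hex()
    else:
        canonical = digest.lower()
    return {
        "algorithm": algorithm,
        "little_endian": little_endian,
        "filename": filename,
        "canonical_digest": canonical,
    }

# Placeholder digests, not real checksums.
print(parse_bsd_line("XXH32 (a.txt) = 12345678"))
print(parse_bsd_line("XXH32_LE (a.txt) = 78563412"))
```

Because the algorithm tag travels with each line, a checker can pick the right hash and byte order per entry instead of guessing from the digest width.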
Detailed list
- api : stabilize XXH3
- cli : xxhsum can produce BSD-style lines, with command --tag
- cli : xxhsum can parse and check BSD-style lines, using command --check, by @WayneD
- cli : xxhsum - accepts console input, requested by @jaki
- cli : xxhsum accepts -- separator, by @jaki
- cli : fix : print correct default algo for symlinked helpers, by @martinetd
- install: improved pkgconfig script, allowing custom install locations, requested by @ellert
xxHash v0.7.4 - Finalizing XXH3 and XXH128
xxHash v0.7.4 is the last evolution of xxh3 and xxh128, primarily designed to finalize the algorithm.
It is considered a release candidate for v0.8.0, which means that if all goes right, this version will be rebranded v0.8.0, almost "as is", within the next few weeks, after receiving sufficient feedback.
v0.8.0 is the official version after which XXH3 and XXH128 are considered "stabilized", meaning that return values will never change given the same input and seed, making the hash suitable for long-term storage and transmission.
Beyond these "final touches", the new version also brings a few notable improvements.
Automatic vector detection
x86/x64 systems can enjoy a new unit, xxh_x86dispatch, which can detect at runtime the best vector instruction set present on host system (none, sse2, avx2 or avx512), thanks to a cpu feature detector designed by @easyaspi314. It then automatically runs the appropriate vector code.
This makes it safer to deploy a single binary with advanced vector instruction sets, such as AVX2, since there is no hard requirement for all target systems to actually support it : the binary can automatically switch to SSE2 instead.
As a proof of concept, the windows builds provided alongside this release are compiled with this new capability.
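The selection step described in this section can be sketched in a few lines of Python. This is only a conceptual model: the real xxh_x86dispatch unit is written in C and queries the CPU itself, whereas here the detected feature set is simply passed in. The names PREFERENCE and pick_vector_path are hypothetical.

```python
# Preference order, best first, mirroring the instruction sets named above.
PREFERENCE = ("avx512", "avx2", "sse2")

def pick_vector_path(cpu_features):
    """Pick the most capable supported vector path, or 'scalar' if none applies."""
    for name in PREFERENCE:
        if name in cpu_features:
            return name
    return "scalar"

# A host without AVX2 falls back to SSE2 instead of crashing on an
# unsupported instruction, which is the point of runtime dispatch.
print(pick_vector_path({"sse2"}))
print(pick_vector_path({"sse2", "avx2"}))
```

The choice is made once at runtime on the host, so a single binary can carry several code paths and still run on older hardware.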
AVX512 support
A new vector instruction set is supported, thanks to @gzm55 : AVX512. It can be applied on XXH3 and XXH128, using some of the most recent Intel cpus, such as IceLake on laptop. It typically offers +50% more performance compared to AVX2.
Secret Generator
Advanced users can be interested in the highly customizable variant _withSecret(), which makes it possible to run XXH3 and XXH128 algorithms using one's own secret.
However, the quality of the hash depends on the high entropy (randomness) of the secret. And sometimes, it can be difficult to ensure that the candidate secret is "random enough".
In order to produce a secret of high quality, a new function XXH3_generateSecret() is proposed in the advanced API section. It will convert any blob of bytes, named customSeed, into a high quality secret which respects all conditions expected by XXH3 and XXH128. This is true even if customSeed itself is of poor quality, such as a bunch of \0 bytes or some short or repeated common sequence.
No API modification
The existing API present in 0.7.3 has remained unchanged in 0.7.4. Any programs linking with 0.7.3 should continue to work as-is.
Note however that xxh3/xxh128 return values are not comparable across these versions.
0.7.x are labelled development versions, and should only be used for ephemeral data (hashes produced and consumed in the same local session).
(note : this limitation does not extend to XXH32 and XXH64, which are considered fully stable and specified).
Changelist
There are multiple smaller bug fixes and minor improvements that have been brought to this repository by great contributors. Here is a summarized list:
- perf: automatic vector detection and selection at runtime (xxh_x86dispatch.h), initiated by @easyaspi314
- perf: added AVX512 support, by @gzm55
- api : new: secret generator XXH3_generateSecret(), suggested by @koraa
- api : fix: XXH3_state_t is movable, identified by @koraa
- api : fix: state is correctly aligned in AVX mode (unlike malloc()), by @easyaspi314
- api : fix: streaming generated wrong values in some combination of random ingestion lengths, reported by @WayneD
- cli : fix unicode print on Windows, by @easyaspi314
- cli : can -c check file generated by sfv
- build: make DISPATCH=1 generates xxhsum and libxxhash with runtime vector detection (x86/x64 only)
- install: cygwin installation support
- doc : Cryptol specification of XXH32 and XXH64, by @weaversa
xxHash v0.7.3
xxHash v0.7.3 is a major evolution for xxh3 and xxh128, with a focus on speed and dispersion performance.
Speed improvements
v0.7.3 pays a lot of attention to small data, by delivering generally faster latency metrics (about +10%).
Inlining is now a first class citizen, as it is generally key to best performance on small inputs.
Among the visible changes:
- XXH_INLINE_ALL can always be set before including xxhash.h, even if xxhash.h was previously included (for example transitively, as part of a prior *.h header file).
- The algorithm implementation has been transferred into xxhash.h. It's no longer necessary to keep a copy of xxhash.c in the /include directory for inlining to work correctly.
  - Note: xxhash.c still exists, as it's useful to instantiate xxhash functions as public symbols accessible from a library or a *.o object file. It also remains compatible with existing projects.
Large data has also received a boost, which can go up to +20% for very large samples (> many MB).
Let's underline the remarkable optimization work of @easyaspi314, who hand optimized several hot loops and instructions, and even added a new Z-vector target for s390x hardware.
No API modification
The API has remained completely stable between 0.7.2 and 0.7.3. Any programs linking with 0.7.2 should work as-is.
Note that xxh3/xxh128 results are not comparable across these versions.
New test tool
Testing a 64-bit hash algorithm for its collision rate has remained elusive for most. The sheer volume of data required to assess quality at this scale is too large for traditional test tools like SMHasher. As a general guide, it takes several billion hashes (on the order of 2^32) to reach a 50% probability of getting a single collision. Accurate collision ratio evaluation requires many more hashes to actually measure something meaningful.
A new open-source tool in tests/collisions offers this capability. It requires a lot of memory to run, with a minimum of 32 GB to measure anything significant. But provided that one has a system with enough capacity, it can accurately measure the collision ratio of any 64-bit hash algorithm.
Several algorithms were measured thanks to this tool, the result of which is currently consolidated on this wiki page. More can be added in the future.
This new development round also introduced several improvements to the SMHasher test suite, uncovering new requirements for new scenarios. This proved beneficial to improve the general dispersion qualities of xxh3 and xxh128.
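The scale of the problem discussed in this section can be checked with the standard birthday approximation. The Python sketch below is a back-of-the-envelope calculation that assumes an ideal 64-bit hash. The function names are illustrative and unrelated to the tests/collisions tool.

```python
import math

def collision_probability(num_hashes, hash_bits=64):
    """Approximate probability of at least one collision (birthday bound)."""
    space = 2.0 ** hash_bits
    # p ~= 1 - exp(-n(n-1) / (2N)); expm1 keeps precision when p is small.
    return -math.expm1(-num_hashes * (num_hashes - 1) / (2.0 * space))

def hashes_for_probability(p, hash_bits=64):
    """Approximate number of hashes needed to reach collision probability p."""
    space = 2.0 ** hash_bits
    return math.sqrt(2.0 * space * math.log(1.0 / (1.0 - p)))

print(collision_probability(2 ** 32))   # probability after 2^32 hashes
print(hashes_for_probability(0.5))      # hashes needed for a 50% chance
```

Under this approximation, 2^32 hashes give a collision probability of roughly 39%, and the 50% mark sits near 5 billion hashes. Since a meaningful collision ratio requires many collisions rather than one, the hash count (and the memory needed to track them) grows well beyond that.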
Changelist
Here is a summarized list of changes for this version:
- perf: improved speed for large inputs (~+20%)
- perf: improved latency for small inputs (~10%)
- perf: s390x Vectorial code, by @easyaspi314
- cli: Improved support for Unicode filenames on Windows, thanks to @easyaspi314 and @t-mat
- api: xxhash.h can now be included in any order, multiple times, with and without XXH_STATIC_LINKING_ONLY or XXH_INLINE_ALL
- build: xxHash's implementation has been transferred into xxhash.h. There is no more need to have xxhash.c in the /include directory for XXH_INLINE_ALL to work
- install: created pkg-config file, by @bket
- install: VCpkg installation instructions, by @LilyWangL
- doc: Highly improved code documentation, by @easyaspi314
- misc: New test tool in /tests/collisions: brute force collision tester for 64-bit hashes
xxHash v0.7.2
This is a maintenance release, focused on the newer 128-bit variant.
Note that XXH3 is still labelled experimental : return values from this version are not comparable with other versions.
- Fixed collision ratio of XXH128 for some specific input lengths, reported by @svpv
- Improved VSX and NEON variants, by @easyaspi314
- Improved performance of scalar code path (XXH_VECTOR=0), by @easyaspi314
- xxhsum: can generate 128-bit hash with command -H2 (note : for experimental purposes only ! XXH128 is not yet frozen)
- xxhsum: option -q removes status notifications
xxHash v0.7.1
The main feature of this release is an update of XXH3, building upon extensive user feedback during this test period. The main points are :
- Secret first : the algorithm computation can be altered by providing a "secret", which is any blob of bytes, of size >= XXH3_SECRET_SIZE_MIN.
- seed is still available, and acts as a secret generator
- As a consequence of these changes, note that new return values of XXH3 are not compatible with v0.7.0
- updated ARM NEON variant by @easyaspi314
- Streaming implementation is available
- Improve compatibility and performance with Visual Studio, with help from @aras-p
- Better integration when using XXH_INLINE_ALL: do not pollute host namespace, use its own macros, such as XXH_ASSERT(), XXH_ALIGN, etc.
- The 128-bit variant provides a helper function for comparison of hashes.
Note that XXH3 is still considered experimental at this stage. It will have to remain stable for at least 2 releases before being branded "stable". After which stage, the algorithm and produced results will no longer evolve.
Several general improvements are also present in this release :
- Better clang generation of rotl instruction, thanks to @easyaspi314
- XXH_REROLL build macro, to reduce binary size, by @easyaspi314
- Improved cmake script, by @Mezozoysky
- Full benchmark program provided in /tests/bench
xxHash v0.7.0
The main highlight of this release is the introduction of XXH3, a new hash algorithm offering much improved speed, for both large and small inputs.
XXH3 is still labelled experimental, and must be unlocked with macro XXH_STATIC_LINKING_ONLY. The source code is located in its own xxh3.h file, which is automatically included (and therefore required) by xxhash.c. It's also possible to include xxh3.h directly, which will have a similar effect as triggering XXH_INLINE_ALL.
At this stage, XXH3 is suitable for ephemeral data and tests, but avoid storing long term hash values yet.
XXH3 will be transferred into stable in a future release, after a period dedicated to gather users' feedback.
For more details on XXH3 performance, see this article.
note : there are known compilation issues under Visual Studio, which have been later fixed in dev branch.
xxHash v0.6.5
- Improved performance on small keys, thanks to suggestions from Jens Bauer
- New build macro, XXH_INLINE_ALL, extremely effective for small keys of fixed length (see this article for details)
- XXH32(): better performance on OS-X clang by disabling auto-vectorization
- Improved benchmark measurements accuracy on small keys
- Included xxHash specification document