27 Oct 16:08

IceTDrinker

2b3992f

TFHE-rs 1.4.2 Latest

Summary

TFHE-rs v1.4.2 fixes an issue where the tags were not properly propagated when using the CompressedXofKeySet.

20 Oct 12:41

IceTDrinker

tfhe-rs-1.4.1

fb4033e

TFHE-rs 1.4.1, tfhe-cuda-backend 0.12.0 and tfhe-hpu-backend 0.3.0

Summary

TFHE-rs v1.4.1 improves performance, adds new cryptographic capabilities, and enhances hardware support across CPU, GPU, and HPU backends.

See full details below:

CPU

Highlights

The CPU backend introduces new APIs for additional security guarantees, extended atomic pattern support, and new encrypted data handling capabilities:

Security: Introduces the ReRand feature to ensure security under the sIND-CPAᴰ model.
Extended KS32 AP support: The keyswitch 32 atomic pattern (KS32 AP) now supports compact public key encryption, keyswitching, compression, and noise squashing.
Performance: KS32 AP provides a 10–19% speedup on 64-bit integer operations.
Encrypted data handling: Adds KVStore, which allows encrypted values stored in a hashmap to be updated blindly.
Parameter clarity: Parameter sets are now standardized and exposed as MetaParameters.
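
To illustrate what a "blind" update means for the KVStore item above, here is a minimal plaintext model in Rust. It is not the tfhe-rs API: `select` and `blind_update` are hypothetical names, and the sketch assumes a store with clear keys where the lookup key is the secret. Every slot is visited and rewritten through a branch-free select, so the access pattern does not reveal which entry changed. Under FHE, the equality test and the select would run on encrypted values.

```rust
use std::collections::BTreeMap;

/// Branch-free select: returns `a` when `cond` is 1 and `b` when `cond` is 0.
/// In an FHE setting `cond` would be an encrypted boolean.
fn select(cond: u64, a: u64, b: u64) -> u64 {
    cond * a + (1 - cond) * b
}

/// Plaintext model of a blind update (hypothetical helper, not a tfhe-rs function).
/// Every slot is touched, so the work done does not depend on `target_key`.
/// Returns 1 if some key matched and 0 otherwise.
fn blind_update(store: &mut BTreeMap<u32, u64>, target_key: u32, new_value: u64) -> u64 {
    let mut found = 0u64;
    for (key, value) in store.iter_mut() {
        // Equality flag as 0/1; this would be an encrypted comparison under FHE.
        let is_match = (*key == target_key) as u64;
        // Matching slots take the new value, all other slots keep their old one.
        *value = select(is_match, new_value, *value);
        found |= is_match;
    }
    found
}
```

Calling `blind_update(&mut store, 2, 99)` on a store holding keys 1, 2 and 3 rewrites all three slots but only changes the value under key 2. The cost is linear in the number of entries, which is the price of hiding the accessed position.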

New Features

Add MetaParameters
Add multi bit PBS support to noise squashing
Add noise squashing support for the KS32 AP
Add ciphertext compression support for the KS32 AP
Add compact public key encryption support for the KS32 AP
Add quasi-uniform OPRF over any range for tfhe::integer
Add KVStore for blind encrypted key-value updates
Add flip operation
Add ReRand primitives for sIND-CPAᴰ security
Add XOF keyset
Make FheUint/FheInt/FheBool compatible with AP params for conformance
Add missing safe_deser for ServerKey in the C API
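
As an illustration of why an OPRF over an arbitrary range is "quasi-uniform" rather than exactly uniform, the sketch below maps uniformly random bits into a range with a multiply-and-shift. This is a plaintext model under the assumption that the range reduction works this way; `map_to_range` and `output_histogram` are hypothetical names, not tfhe-rs functions. When the range size does not divide the number of inputs, some outputs occur once more often than others, and the bias shrinks as more random bits are used.

```rust
/// Maps `random_bits`, assumed uniform in [0, 2^num_bits), into [0, upper_bound)
/// by computing floor(random_bits * upper_bound / 2^num_bits).
fn map_to_range(random_bits: u64, num_bits: u32, upper_bound: u64) -> u64 {
    assert!(num_bits >= 1 && num_bits <= 64);
    assert!(upper_bound > 0);
    // The product is computed in 128 bits so it cannot overflow.
    ((random_bits as u128 * upper_bound as u128) >> num_bits) as u64
}

/// Counts how often each output occurs when every input in [0, 2^num_bits)
/// is used exactly once. Only intended for small `num_bits` (below 64).
fn output_histogram(num_bits: u32, upper_bound: u64) -> Vec<u64> {
    let mut counts = vec![0u64; upper_bound as usize];
    for r in 0..(1u64 << num_bits) {
        counts[map_to_range(r, num_bits, upper_bound) as usize] += 1;
    }
    counts
}
```

With 8 random bits and a range of size 3, `output_histogram(8, 3)` gives counts of 86, 85 and 85: the outputs are not perfectly balanced, but no count differs from another by more than one.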

Improvements

Improve FFT and NTT plan cache locking

Fixes

Set correct degree for noise squashed decompressed ciphertext
Avoid potential overflow for GLWE encryption on 32-bit platforms
Fix NTT plan yielding incorrect results for a class of primes
Fix scalar size check before ZK public key encryption

GPU

The GPU backend receives major performance upgrades, improved PBS techniques, and new compression and benchmarking capabilities:

Performance: All operations see 2× speedup on H100 GPUs, with certain primitives (multiplication, division, OPRF, ilog2, scalar division and multiplication) reaching 3–10× acceleration.
PBS enhancements: For classical PBS, a new technique called "mean reduction" replaces the previous "drift" technique, keeping the same cryptographic parameters without requiring an additional key.
Noise squashing: Multi-bit noise squashing is introduced, providing up to 4× faster execution compared to classical PBS.
Compression: Adds support for 128-bit compression.
New benchmark: A new benchmark on GPU is introduced to perform AES encryption using FHE (in counter mode).
Parameter clarity: Parameter sets are now standardized and exposed as MetaParameters.
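
For readers unfamiliar with counter mode, which the AES benchmark above relies on, the following Rust sketch shows its structure in the clear. The block function is a hypothetical toy stand-in: it is not AES and not secure, and none of these names come from tfhe-rs or the CUDA backend. The point is only that block i of the keystream is the cipher applied to `iv + i`, and that the data is XORed with that keystream.

```rust
/// Hypothetical stand-in for a 128-bit block cipher. NOT AES and not secure;
/// it only gives the sketch something deterministic to call.
fn toy_block_encrypt(key: u128, block: u128) -> u128 {
    (block ^ key)
        .rotate_left(29)
        .wrapping_mul(0x9E37_79B9_7F4A_7C15_F39C_C060_5CED_C835)
        ^ key
}

/// Counter mode: block i of the keystream is E_key(iv + i), and each data
/// block is XORed with it. The same function encrypts and decrypts.
fn ctr_apply(key: u128, iv: u128, data: &[u128]) -> Vec<u128> {
    data.iter()
        .enumerate()
        .map(|(i, block)| block ^ toy_block_encrypt(key, iv.wrapping_add(i as u128)))
        .collect()
}
```

Because each keystream block depends only on the key and its counter value, the blocks can be computed independently of one another, and applying `ctr_apply` twice with the same key and IV returns the original data.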

New Features

Add 128-bit multi-bit PBS for noise squashing
Add 128-bit compression
Add the centered modulus switch technique to reduce noise in the classical PBS
FHE encryption of AES 128 in counter mode on GPU (available in the integer API)

Improvements

Create a specialized version of the multi-bit PBS using thread block clusters, which gives a 2x performance improvement on all operations on H100
Improve the multi-GPU communication scheme
Use CUDA mempools to optimize memory reuse
Improve division performance on nodes with 4 GPUs or more: overall division is 4x faster than in the previous release
Improve encrypted random generation (OPRF) performance by implementing it in CUDA/C++ instead of Rust (results in 10x faster OPRF)
Improve ilog2 performance by implementing it in CUDA/C++ instead of Rust
Enable LUT generation with preallocated CPU buffers to avoid some synchronizations with the CPU in comparisons
Add an assert to ensure the carry part has the correct size in expand
Create the message extract LUT only when needed for carry propagation
Internal refactors to enhance the C++/Rust interface (pass streams and gpu indexes in a struct, pass compression data via a struct)

Fixes

Fix memory leak in multi-gpu calculations
Fix pbs128 multi-gpu bug
Fix some wrong indexes used in cuda_set_device()
Fix inconsistent types to avoid overflows
Add missing syncs when releasing scalar ops and returning trivial radix
Fix the decompression function signature in the CUDA backend

HPU

The HPU backend improves overall latency and execution throughput:

Latency reduction: Overall execution latency is reduced across all HPU operations.
Throughput increase: New SIMD operations further increase the throughput of the HPU on a single V80 FPGA.

New Features

Add 400 MHz HPU v2.1 bitstream
Add ERC20_SIMD & ADD_SIMD operations
Add support of servers with multiple V80 boards (only one is used)

Improvements

Improve latency & throughput benches (HLAPI & integer) to execute some new operations and be more stable
Improve scheduling of MUL operation
Slightly reduce software latency when pushing an IOp and receiving its acknowledgement
In HPU v2.1 bitstream:
Compiled with Vivado 2025.1
Improved place & route (especially on reset) to reach 400 MHz
Increased bandwidth to load BSK & KSK
Improved accumulator (MMACC) structure to match PBS batch size (12)

Fixes

Stabilize HPU IOp queue
Fix a few operations (ilog2, trail0/1, ovf_mul...)

29 Sep 16:30

IceTDrinker

tfhe-rs-1.4.0-alpha.3

2602c9e

TFHE-rs 1.4.0-alpha.3 Pre-release

tfhe-rs-1.4.0-alpha.3

tfhe-rs 1.4.0-alpha.3 release

29 Sep 07:41

IceTDrinker

tfhe-rs-1.4.0-alpha.2

23d46ba

TFHE-rs 1.4.0-alpha.2 and tfhe-cuda-backend 0.12.0-alpha.2 Pre-release

tfhe-rs-1.4.0-alpha.2

tfhe-rs 1.4.0-alpha.2 release

26 Sep 13:21

IceTDrinker

tfhe-rs-1.4.0-alpha.1

6ca4813

TFHE-rs 1.4.0-alpha.1 and tfhe-cuda-backend 0.12.0-alpha.1 Pre-release

tfhe-rs-1.4.0-alpha.1

tfhe-rs 1.4.0-alpha.1 release

24 Sep 14:52

IceTDrinker

tfhe-versionable-0.6.2

9457ca7

tfhe-versionable 0.6.2 and tfhe-ntt 0.6.1

tfhe-versionable-0.6.2

tfhe-versionable 0.6.2 release

24 Sep 14:51

IceTDrinker

tfhe-rs-1.4.0-alpha.0

d60028c

TFHE-rs 1.4.0-alpha.0, tfhe-cuda-backend 0.12.0-alpha.0 and tfhe-zk-pok 0.7.3 Pre-release

tfhe-rs-1.4.0-alpha.0

tfhe-rs 1.4.0-alpha.0 release

08 Sep 07:45

nsarlin-zama

tfhe-zk-pok-0.7.2

9c0d078

tfhe-zk-pok 0.7.2

Description

This release fixes some corner cases in the four_squares algorithm used by pkev2.

21 Aug 08:06

nsarlin-zama

tfhe-zk-pok-0.7.1

0a28488

tfhe-zk-pok 0.7.1

Summary

This release adds a new type, curve_446::zp::ZeroizeZp, which is similar to curve_446::zp::Zp but derives ZeroizeOnDrop at the cost of not being Copy.
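
The trade-off between wiping on drop and being Copy comes from Rust itself: a type with a destructor cannot implement Copy. The sketch below shows this with a hand-written Drop on a hypothetical `SecretScalar` type. It is a stand-in for illustration only, not the tfhe-zk-pok type, and it does not use the zeroize crate that provides ZeroizeOnDrop.

```rust
/// Hypothetical stand-in for a secret field element; not the tfhe-zk-pok type.
/// It can be Clone, but adding Copy to the derive would fail to compile
/// because the type also implements Drop.
#[derive(Clone)]
struct SecretScalar {
    limbs: [u64; 4],
}

impl SecretScalar {
    /// Overwrites the limbs with zeros using volatile writes, so the
    /// compiler does not optimise the wipe away as a dead store.
    fn zeroize(&mut self) {
        for limb in self.limbs.iter_mut() {
            unsafe { std::ptr::write_volatile(limb, 0) };
        }
    }
}

/// Wipe the secret whenever a value goes out of scope.
impl Drop for SecretScalar {
    fn drop(&mut self) {
        self.zeroize();
    }
}
```

In practice this means values of such a type must be moved or explicitly cloned, and each clone is wiped separately when it is dropped.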

11 Aug 15:15

nsarlin-zama

tfhe-rs-1.3.3

a20290c

TFHE-rs 1.3.3 and tfhe-versionable 0.6.1

Summary

This release adds some missing APIs:

TFHE-rs 1.3.3

Add into/from_raw_parts functions for compressed KSK material

tfhe-versionable 0.6.1

Implement Versionize/Unversionize for BTreeSet/BTreeMap

Releases: zama-ai/tfhe-rs
