
CKKS: Add CCH+23 Noise Model #1685


Draft: wants to merge 1 commit into main
Conversation

ZenithalHourlyRate
Collaborator

The noise model is mainly taken from *On the precision loss in approximate homomorphic encryption*. Marking as draft as it needs discussion (or research).

I think the trickiest part is multiplication: unlike the variance-based noise models for BGV/BFV, where the message itself often does not matter much, in CKKS the message heavily affects how the noise grows.

In short, in CKKS the message has the form $\Delta m + e$. After multiplication with $\Delta m_2 + e_2$, the message becomes $\Delta (\Delta m m_2 + m_2 e + m e_2) + e e_2$ (see the expansion below). The $m_2 e + m e_2$ part is the dominant part of the noise, and we actually have no average-case way to analyse it.
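
Written out explicitly in the same notation (before rescaling by $\Delta$), the product groups as

$$(\Delta m + e)(\Delta m_2 + e_2) = \Delta^2 m m_2 + \Delta (m_2 e + m e_2) + e e_2 = \Delta\bigl(\Delta m m_2 + m_2 e + m e_2\bigr) + e e_2,$$

so after rescaling by $\Delta$ the result carries the desired message $\Delta m m_2$ plus the cross terms $m_2 e + m e_2$ and a small $e e_2 / \Delta$ contribution.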

Past work argues that we can assume the coefficients of the message are uniform in $[-1, 1]$; however, this is hardly true in real-world applications, so we cannot use an average-case approach here.

Other works ask the user to provide the input, which is common in practice (OpenFHE has EXEC_NOISE_ESTIMATION, and Lattigo has a paper with an estimator that asks for user input). HEIR also provides similar infrastructure via the plaintext backend, but even knowing exactly what $m$ is, the behavior of $m e_2$ remains hard to understand (past papers give little detail on this). In the code, I just use $N \|\Delta m\|_\infty \|e_2\|_\infty$, the worst-case bound on the coefficient embedding. The bound is then translated into a variance as $N \cdot N \cdot \Delta \cdot \Delta \cdot B \cdot B \cdot \text{variance}$. This approach is still questionable; see the code comment for details.


Note that in https://github.com/bencrts/CKKS_noise/blob/main/heuristics/CLT.py they use $\Delta \cdot \Delta \cdot B \cdot B \cdot \text{variance}$ (note the missing $N$), which gives an underestimate in my runs. In the paper, the bound they use is the squared 2-norm of $m$, so $N$ is still there, but using $N \cdot \Delta \cdot \Delta \cdot B \cdot B \cdot \text{variance}$ still gives an underestimate.
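
For concreteness, here is a minimal sketch (not the PR code; the helper name is made up, and interpreting $B$ as the bound on $\|m\|_\infty$ and `variance` as the variance of $e_2$ is an assumption) comparing the three candidate scalings:

```python
import math

def cross_term_variance(N, delta, B, variance, scaling="N^2"):
    """Variance assigned to the m*e2 cross term after a CKKS multiplication.

    N        -- ring dimension
    delta    -- scaling factor Delta
    B        -- assumed bound on the message coefficients |m|_inf
    variance -- variance of the noise e2
    scaling  -- which candidate formula to apply
    """
    base = delta * delta * B * B * variance
    if scaling == "none":  # CLT.py heuristic: Delta^2 * B^2 * variance (no N)
        return base
    if scaling == "N":     # squared-2-norm bound from the paper: N * Delta^2 * B^2 * variance
        return N * base
    if scaling == "N^2":   # worst-case infinity-norm bound used in this PR
        return N * N * base
    raise ValueError(scaling)

# Illustrative numbers only: logN = 13, 45-bit scale, |m|_inf <= 1, fresh-noise variance 3.2^2.
N, delta, B, variance = 1 << 13, 2.0 ** 45, 1.0, 3.2 ** 2
for s in ("none", "N", "N^2"):
    bits = 0.5 * math.log2(cross_term_variance(N, delta, B, variance, s))
    print(f"{s:>4}: ~{bits:.1f} noise bits")
```

Each extra factor of $N$ adds about $\log_2(N)/2$ bits to the estimate.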

Missing parts in this PR: implement inverse canonical encoding in HEIR so we actually know the plaintext $m$ from the cleartext value, and integrate the noise model with the plaintext backend.
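
As a point of reference, a minimal numpy sketch of what the (inverse) canonical encoding step looks like, i.e. going from cleartext slot values to the plaintext polynomial $m$. This is illustrative only, not HEIR or Lattigo code; the function names are made up, and the power-of-5 slot ordering is the usual CKKS convention assumed here.

```python
import numpy as np

def ckks_encode(slots, N, delta):
    """Inverse canonical embedding: slot values -> integer plaintext coefficients."""
    zeta = np.exp(1j * np.pi / N)                       # primitive 2N-th root of unity
    exps = np.array([pow(5, j, 2 * N) for j in range(N // 2)])
    roots = zeta ** exps                                # evaluation points for the slots
    V = np.vander(roots, N, increasing=True)            # (N/2) x N evaluation matrix
    A = np.vstack([V, V.conj()])                        # conjugate rows give the other half
    b = np.concatenate([slots, np.conj(slots)])
    coeffs = np.linalg.solve(A, b).real                 # real coefficients of m(X)
    return np.rint(delta * coeffs).astype(np.int64)     # scale by Delta and round

def ckks_decode(coeffs, N, delta):
    """Canonical embedding: plaintext coefficients -> slot values."""
    zeta = np.exp(1j * np.pi / N)
    exps = np.array([pow(5, j, 2 * N) for j in range(N // 2)])
    roots = zeta ** exps
    return np.polyval(coeffs[::-1].astype(np.float64), roots) / delta

# Round-trip the dot-product test input from below on a toy ring (N = 16, 8 slots).
N, delta = 16, 2.0 ** 30
slots = np.array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8], dtype=np.complex128)
m = ckks_encode(slots, N, delta)
print(np.round(ckks_decode(m, N, delta).real, 6))
```

Encoding all 1s with this map gives $m = \Delta$ as a constant polynomial, which matches the observation in the all-1 experiment below.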

I would say that exact noise estimation for CKKS without knowing the input is an open problem, and I would conjecture there may be no good way to do an average-case analysis without knowing the input. Even knowing the input or the input domain, we might only be able to do a worst-case analysis, since the input distribution could be hand-crafted to launch an attack. After all, we cannot ask the user to provide the input distribution. (This also relates to the IND-CPA definition, the IND-CPA-D definition, and the recent application-aware security model, where only the circuit and its input domain are required to be provided. For input distributions, some works point to differential privacy.)

An experiment to give some sense of the above comment:

For the multiplication of two freshly encrypted ciphertexts, we consider the following cases.

All 0

This time $m = 0$, so the noise is only $e e_2$, and we can use the average-case approach by setting the new variance to $N \rho \rho_2$.

```
Input
  Scale: 45
  Precision lost: 2^-34.0
  Noise: 6.83
Input
  Scale: 45
  Precision lost: 2^-32.7
  Noise: 6.95
lattigo.ckks.mul
  Scale: 90
  Precision lost: 2^-67.2
  Noise: 19.30
```

Note that 7 + 7 + 13/2 = 20.5, where logN = 13; this is some kind of rough estimate (ignoring all the error-function stuff).
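
That back-of-the-envelope number can be reproduced directly (a sketch, assuming the noise bits simply add and the $N \rho \rho_2$ variance above contributes roughly $\log_2(N)/2$ extra bits):

```python
def mul_noise_bits_all_zero(n1_bits, n2_bits, logN):
    # Each coefficient of e * e2 is a (negacyclic) sum of N products, so its
    # magnitude grows by roughly sqrt(N), i.e. logN / 2 bits, on top of the
    # two input noise levels.
    return n1_bits + n2_bits + logN / 2

print(mul_noise_bits_all_zero(6.83, 6.95, 13))  # ~20.3 bits; Lattigo reports 19.30
```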

All 1

The interesting part of encoding all 1s is that the encoded message is actually $\Delta + 0 \cdot X + 0 \cdot X^2 + \cdots$, namely just a constant (see #1604 (comment)). Then $m e_2$ can easily be understood as a constant multiplication.

```
Input
  Scale: 45
  Precision lost: 2^-33.2
  Noise: 7.00
Input
  Scale: 45
  Precision lost: 2^-32.6
  Noise: 6.87
lattigo.ckks.mul
  Scale: 90
  Precision lost: 2^-32.3
  Noise: 52.38
```

Note that 7 + 45 = 52.

The dot-product input from the test example

The inputs are now [0.1, 0.2, ..., 0.8] and [0.2, 0.3, ..., 0.9].

```
Input
  Scale: 45
  Precision lost: 2^-32.5
  Noise: 7.01
Input
  Scale: 45
  Precision lost: 2^-32.3
  Noise: 6.99
lattigo.ckks.mul
  Scale: 90
  Precision lost: 2^-25.4
  Noise: 61.90
```

The prediction by the model

The model does not know the exact value of the input for now.

```
Propagating 6.82 to <block argument> of type 'tensor<8xf32>' at index: 0
Propagating 6.82 to <block argument> of type 'tensor<8xf32>' at index: 1
Propagating 65.32 to %2 = arith.mulf %input0, %input1 {mgmt.mgmt = #mgmt.mgmt<level = 1, dimension = 3, scale = 90>} : tensor<8xf32>
```

Note that 7 + 45 + 13 = 65 where logN = 13.
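
Putting the three rough estimates next to the observed Lattigo noise (a sketch using only the numbers quoted above; the per-case formulas are the ones from the respective comments):

```python
logN, log_scale = 13, 45  # logN = 13, 45-bit scaling factor

# (rough estimate in bits, observed noise in bits from the runs above)
cases = {
    "all zero":         (6.83 + 6.95 + logN / 2, 19.30),  # only e*e2: n1 + n2 + logN/2
    "all one":          (7.00 + log_scale,        52.38),  # m is the constant Delta: n2 + log2(Delta)
    "worst-case model": (6.82 + log_scale + logN, 61.90),  # model prediction vs. the dot-product run
}

for name, (estimate, observed) in cases.items():
    print(f"{name}: estimated ~{estimate:.1f} bits, observed {observed} bits")
```

For the dot-product run the model propagates 65.32 bits while Lattigo reports 61.90, so the worst-case estimate over-shoots this particular input; the comment below argues the gap can plausibly be closed by hand-crafted inputs.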

I believe that with all slots filled with hand-crafted values (one may find construction clues in previous attacks), such a bound could be reached or even exceeded.
