4545 < meta property ="twitter:creator " content ="https://x.com/gvamsip " />
4646
4747 < link rel ="stylesheet " href ="https://cdn.jsdelivr.net/npm/katex@0.16.0/dist/katex.min.css ">
48- < script defer src ="https://cdn.jsdelivr.net/npm/katex@0.16.0/dist/katex.min.js "> </ script >
49- < script defer src ="https://cdn.jsdelivr.net/npm/katex@0.16.0/dist/contrib/auto-render.min.js " onload ="renderMathInElement(document.body); "> </ script >
48+ < script defer src ="https://cdn.jsdelivr.net/npm/katex@0.16.0/dist/katex.min.js "
49+ onload ="document.querySelectorAll('.math').forEach(function (el) {
50+ katex.render(el.textContent, el, {
51+ displayMode: el.classList.contains('display'),
52+ throwOnError: false
53+ });
54+ }); "> </ script >
5055
5156 < link rel ="stylesheet " type ="text/css " href ="/style.css ">
5257</ head >
@@ -117,8 +122,8 @@ <h1>The Proof is in the Pairing</h1>
117122 (exponentiations/pairings) per transaction when using
118123 pairing-based techniques. Note that this is (asymptotically)
119124 optimal, so doing much better is difficult. So if we budget
120- < span class ="math inline "> \(\ approx 1\) </ span > ms privacy
121- overhead per transaction (about the time taken for a pairing on
125+ < span class ="math inline "> \approx 1</ span > ms privacy overhead
126+ per transaction (about the time taken for a pairing on
122127 BLS12-381), processing 100K transactions requires 100 s in
123128 single-core CPU time. Assuming perfect parallelization, we can
124129 of course reach 100K TPS with 100 cores—or a GPU—but this is
@@ -137,9 +142,9 @@ <h2 id="verifying-zksnarks-at-scale">Verifying zkSNARKs at
137142 < p > But this strategy comes with two limitations:</ p >
138143 < ol type ="1 ">
139144 < li > introduces an extra “hop” in each slot from proposer < span
140- class ="math inline "> \(\ rightarrow\) </ span > aggregator < span
141- class ="math inline "> \(\ rightarrow\) </ span > validator, which in
142- turn increases latency</ li >
145+ class ="math inline "> \rightarrow</ span > aggregator < span
146+ class ="math inline "> \rightarrow</ span > validator, which in turn
147+ increases latency</ li >
143148 < li > relies on availability of the aggregator to maintain the
144149 system’s throughput</ li >
145150 </ ol >
@@ -178,51 +183,50 @@ <h3 id="our-solution">Our Solution</h3>
178183 the verification algorithm:</ p >
179184 < ol type ="1 ">
180185 < li > < p > Parse the verification key < span
181- class ="math inline "> \(\ textsf{ivk} = ((A,B), [\alpha]_1,
186+ class ="math inline "> \textsf{ivk} = ((A,B), [\alpha]_1,
182187 [\beta]_1, [\delta_1]_2, [\delta_2]_2, [\tau]_2, [\delta_1
183- \tau]_2)\) </ span > , where < span class ="math inline "> \(A\) </ span >
184- and < span class ="math inline "> \(B\) </ span > form the square R1CS
188+ \tau]_2)</ span > , where < span class ="math inline "> A </ span > and
189+ < span class ="math inline "> B </ span > form the square R1CS
185190 relation.</ p > </ li >
186- < li > < p > Parse the proof < span class ="math inline "> \(\ pi = (T, U,
191+ < li > < p > Parse the proof < span class ="math inline "> \pi = (T, U,
187192 v_a, v_b) \in \mathbb{G}_1^2 \times
188- \mathbb{F}^2\) </ span > .</ p > </ li >
189- < li > < p > Compute the challenge < span class ="math inline "> \( r :=
190- H(\textsf{transcript})\) </ span > .</ p > </ li >
191- < li > < p > Compute < span class ="math inline "> \( x_A := A \cdot (x \|
192- 0)\) </ span > and < span class ="math inline "> \( x_B := B \cdot (x \|
193- 0)\) </ span > , and interpolate over domain < span
194- class ="math inline "> \(K\) </ span > to obtain polynomials < span
195- class ="math inline "> \(\ hat{x}_A\) </ span > and < span
196- class ="math inline "> \(\ hat{x}_B\) </ span > .</ p > </ li >
193+ \mathbb{F}^2</ span > .</ p > </ li >
194+ < li > < p > Compute the challenge < span class ="math inline "> r :=
195+ H(\textsf{transcript})</ span > .</ p > </ li >
196+ < li > < p > Compute < span class ="math inline "> x_A := A \cdot (x \|
197+ 0)</ span > and < span class ="math inline "> x_B := B \cdot (x \|
198+ 0)</ span > , and interpolate over domain < span
199+ class ="math inline "> K </ span > to obtain polynomials < span
200+ class ="math inline "> \hat{x}_A</ span > and < span
201+ class ="math inline "> \hat{x}_B</ span > .</ p > </ li >
197202 < li > < p > Compute the quotient evaluation, where < span
198- class ="math inline "> \(z_K\)</ span > is the vanishing polynomial
199- on the domain < span class ="math inline "> \(K\)</ span > :</ p >
200- < p > < span class ="math display "> \[v_q := \frac{(v_a +
201- \hat{x}_A(r))^2 - (v_b +
202- \hat{x}_B(r))}{z_K(r)}\]</ span > </ p > </ li >
203+ class ="math inline "> z_K</ span > is the vanishing polynomial on
204+ the domain < span class ="math inline "> K</ span > :</ p >
205+ < p > < span class ="math display "> v_q := \frac{(v_a +
206+ \hat{x}_A(r))^2 - (v_b + \hat{x}_B(r))}{z_K(r)}</ span > </ p > </ li >
203207 < li > < p > Check the pairing equation:</ p >
204- < p > < span class ="math display "> \[ e(T,\; [\delta_2]_2)
208+ < p > < span class ="math display "> e(T,\; [\delta_2]_2)
205209 \stackrel{?}{=} e(U,\; [\delta_1 \tau]_2 - r \cdot [\delta_1]_2)
206210 \cdot e(v_a \cdot [\alpha]_1 + v_b \cdot [\beta]_1 + v_q \cdot
207- [1]_1,\; [1]_2)\] </ span > </ p > </ li >
211+ [1]_1,\; [1]_2)</ span > </ p > </ li >
208212 </ ol >
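Steps 4 and 5 above can be sketched over a toy field. This is only an illustration under stated assumptions, not the code from the linked implementation: the prime 97, the order-4 domain K, and the helper names (`interpolate_eval`, `vanishing_eval`, `quotient_eval`) are all hypothetical, and the real scheme works over the BLS12-381 scalar field with a much larger domain.

```python
P = 97                # toy prime field (assumption, for illustration only)
K = [1, 22, 96, 75]   # multiplicative subgroup of order 4: 22**2 % 97 == 96 == -1


def interpolate_eval(values, point):
    """Evaluate at `point` the unique polynomial of degree < |K| that
    takes `values` on the domain K (plain Lagrange interpolation)."""
    total = 0
    for i, (k_i, y_i) in enumerate(zip(K, values)):
        num, den = 1, 1
        for j, k_j in enumerate(K):
            if i != j:
                num = num * (point - k_j) % P
                den = den * (k_i - k_j) % P
        total = (total + y_i * num * pow(den, -1, P)) % P
    return total


def vanishing_eval(point):
    """z_K(X) = X^|K| - 1 when K is a multiplicative subgroup."""
    return (pow(point, len(K), P) - 1) % P


def quotient_eval(v_a, v_b, x_A, x_B, r):
    """v_q = ((v_a + x_A(r))^2 - (v_b + x_B(r))) / z_K(r), with r outside K."""
    a = (v_a + interpolate_eval(x_A, r)) % P
    b = (v_b + interpolate_eval(x_B, r)) % P
    return (a * a - b) * pow(vanishing_eval(r), -1, P) % P
```

The challenge r must lie outside K, otherwise z_K(r) is zero and the division is undefined. With a hash-derived challenge over a large field this happens only with negligible probability.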
209213 < p > Rearranging the last equation, we have:</ p >
210- < p > < span class ="math display "> \[ e(\colorbox{lightgrey}{$T$},\;
214+ < p > < span class ="math display "> e(\colorbox{lightgrey}{$T$},\;
211215 [\delta_2]_2) \stackrel{?}{=} e(\colorbox{lightgrey}{$U$},\;
212216 [\delta_1 \tau]_2) \cdot e(\colorbox{lightgrey}{$-r \cdot U$},\;
213217 [\delta_1]_2) \cdot e(\colorbox{lightgrey}{$v_a$} \cdot
214218 [\alpha]_1 + \colorbox{lightgrey}{$v_b$} \cdot [\beta]_1 +
215- \colorbox{lightgrey}{$v_q$} \cdot [1]_1,\; [1]_2)\] </ span > </ p >
219+ \colorbox{lightgrey}{$v_q$} \cdot [1]_1,\; [1]_2)</ span > </ p >
216220 < p > where only the < span
217- class ="math inline "> \(\ colorbox{lightgrey}{\text{highlighted}}\) </ span >
221+ class ="math inline "> \colorbox{lightgrey}{\text{highlighted}}</ span >
218222 terms change across different proofs (under the same
219223 verification key). Thus, we can batch verify multiple proofs
220224 (see < a href ="https://eprint.iacr.org/2008/015.pdf "> FGHP09</ a > )
221225 by taking a random linear combination using three < span
222- class ="math inline "> \(\ mathbb{G}_1\) </ span > MSMs for the < span
223- class ="math inline "> \( T, U, r\cdot U\) </ span > terms, field
224- multiplications for the < span class ="math inline "> \( v_a, v_b,
225- v_q\) </ span > terms, and finally checking a single multi-pairing.
226+ class ="math inline "> \mathbb{G}_1</ span > MSMs for the < span
227+ class ="math inline "> T, U, r\cdot U</ span > terms, field
228+ multiplications for the < span class ="math inline "> v_a, v_b,
229+ v_q</ span > terms, and finally checking a single multi-pairing.
226230 Of course, we still need to carry out steps 1-5 for each proof,
227231 but these are very fast hashing and field operations. This
228232 strategy applies more broadly to KZG opening proofs and
@@ -232,14 +236,13 @@ <h3 id="our-solution">Our Solution</h3>
232236 < p > A quick implementation (with room for further optimization)
233237 of this idea can be found < a
234238 href ="https://github.com/guruvamsi-policharla/garuda-pari/pull/1 "> here</ a > ,
235- and it shows a < span class ="math inline "> \(60\times\)</ span >
236- speedup when verifying < span
237- class ="math inline "> \(2^{16}\)</ span > proofs relative to naive
238- individual verification. Both experiments were run in
239- single-threaded mode. Concretely, this amounts to < span
240- class ="math inline "> \(\approx 10\mu\text{s}\)</ span > per proof
241- on an M5 MacBook Pro, down from 0.6 ms per proof. And the more
242- proofs you verify, the faster it gets!</ p >
239+ and it shows a < span class ="math inline "> 60\times</ span > speedup
240+ when verifying < span class ="math inline "> 2^{16}</ span > proofs
241+ relative to naive individual verification. Both experiments were
242+ run in single-threaded mode. Concretely, this amounts to < span
243+ class ="math inline "> \approx 10\mu\text{s}</ span > per proof on an
244+ M5 MacBook Pro, down from 0.6 ms per proof. And the more proofs
245+ you verify, the faster it gets!</ p >
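The random-linear-combination batching described above can be illustrated with a toy model. This is a sketch of the algebra only, under loud assumptions: group elements are represented by their discrete logarithms modulo a small prime, and the "pairing" is just multiplication of those logarithms, so nothing here is secure or taken from the linked implementation. All function names are hypothetical.

```python
import random

Q = 2**61 - 1  # toy prime standing in for the group order (assumption)


def pairing(a, b):
    # Toy "pairing": a and b are discrete logs, so e(a, b) is a * b mod Q,
    # with the target group written additively.
    return (a * b) % Q


def make_vk(rng):
    names = ("alpha", "beta", "delta1", "delta2", "tau")
    return {name: rng.randrange(1, Q) for name in names}


def make_valid_proof(vk, rng):
    # Choose everything except T at random, then solve the check for T.
    U, r, v_a, v_b, v_q = (rng.randrange(1, Q) for _ in range(5))
    rhs = (pairing(U, vk["delta1"] * (vk["tau"] - r))
           + v_a * vk["alpha"] + v_b * vk["beta"] + v_q) % Q
    T = rhs * pow(vk["delta2"], -1, Q) % Q
    return {"T": T, "U": U, "r": r, "v_a": v_a, "v_b": v_b, "v_q": v_q}


def verify_one(vk, p):
    # e(T, d2) == e(U, d1*tau - r*d1) * e(v_a*alpha + v_b*beta + v_q, 1)
    g2 = vk["delta1"] * vk["tau"] - p["r"] * vk["delta1"]
    g1 = p["v_a"] * vk["alpha"] + p["v_b"] * vk["beta"] + p["v_q"]
    return pairing(p["T"], vk["delta2"]) == (pairing(p["U"], g2) + pairing(g1, 1)) % Q


def verify_batch(vk, proofs, rng):
    rho = [rng.randrange(1, Q) for _ in proofs]  # random nonzero coefficients
    pairs = list(zip(rho, proofs))
    # Three "MSMs" over the proof-dependent terms T, U and -r*U.
    msm_T = sum(c * p["T"] for c, p in pairs) % Q
    msm_U = sum(c * p["U"] for c, p in pairs) % Q
    msm_rU = sum(-c * p["r"] * p["U"] for c, p in pairs) % Q
    # Field-only combinations for v_a, v_b, v_q.
    s_a = sum(c * p["v_a"] for c, p in pairs) % Q
    s_b = sum(c * p["v_b"] for c, p in pairs) % Q
    s_q = sum(c * p["v_q"] for c, p in pairs) % Q
    g1 = (s_a * vk["alpha"] + s_b * vk["beta"] + s_q) % Q
    # A single multi-pairing check for the whole batch.
    lhs = pairing(msm_T, vk["delta2"])
    rhs = (pairing(msm_U, vk["delta1"] * vk["tau"])
           + pairing(msm_rU, vk["delta1"])
           + pairing(g1, 1)) % Q
    return lhs == rhs
```

In this model the per-proof cost is a handful of multiplications, while the number of pairings stays fixed at four regardless of batch size, which mirrors why the amortized cost per proof falls as the batch grows.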
243246 < div data-align ="center ">
244247 < table >
245248 < thead >
@@ -289,7 +292,7 @@ <h3 id="our-solution">Our Solution</h3>
289292 proportionally, verifying a SNARK can be cheaper than an ERC-20
290293 transfer. A parallel implementation using 8 threads is able to
291294 verify over 500K proofs in < span
292- class ="math inline "> \( <750\) </ span > ms.</ p >
295+ class ="math inline "> <750</ span > ms.</ p >
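As a quick sanity check on these figures, the arithmetic can be written out as below. The helper names are hypothetical, and since the post states "over 500K proofs" in under 750 ms, the results are bounds rather than measurements.

```python
def per_proof_microseconds(total_ms, num_proofs, threads=1):
    """Return (wall-clock us per proof, thread-time us per proof)."""
    wall = total_ms * 1000 / num_proofs
    return wall, wall * threads


def proofs_per_second(total_ms, num_proofs):
    return num_proofs / (total_ms / 1000)


# 8 threads, 500K proofs, 750 ms: at most 1.5 us of wall-clock time per proof,
# i.e. at most 12 us of thread time per proof.
wall_us, thread_us = per_proof_microseconds(750, 500_000, threads=8)
throughput = proofs_per_second(750, 500_000)

# Single-threaded figures from the post: 0.6 ms down to about 10 us per proof.
single_thread_speedup = 600 / 10
```

The 12 µs of thread time per proof is in the same range as the single-threaded figure of roughly 10 µs, and the throughput bound works out to more than 660K proofs per second.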
293296 < div data-align ="center ">
294297 < table >
295298 < thead >
@@ -340,8 +343,8 @@ <h3 id="our-solution">Our Solution</h3>
340343 extended to support standard R1CS circuits (see Remark 2.3 in < a
341344 href ="https://eprint.iacr.org/2024/1245 "> DMS24</ a > ) by
342345 increasing the proof size to 2 < span
343- class ="math inline "> \(\ mathbb{G}_1\) </ span > elements and 3 < span
344- class ="math inline "> \(\ mathbb{F}\) </ span > elements.</ p >
346+ class ="math inline "> \mathbb{G}_1</ span > elements and 3 < span
347+ class ="math inline "> \mathbb{F}</ span > elements.</ p >
345348 < p > Stay tuned for our upcoming posts on scaling timelock
346349 encryption and batched threshold encryption for encrypted
347350 mempools!</ p >