| Field | Value |
|---|---|
| Paper | Communication-Computation Trade-offs in PIR (2019) |
| Archetype | Comparison / multi-paradigm |
| PIR Category | Group X — Extensions |
| Security model | Semi-honest single-server CPIR (all constructions); SPIR extension in Appendix E.1 |
| Additional assumptions | RLWE (SealPIR/MulPIR); Phi-hiding / safe primes (Gentry-Ramzan) |
| Correctness model | Deterministic (FHE noise budget managed by parameter selection for HE schemes; exact arithmetic for Gentry-Ramzan) |
| Rounds (online) | 1 (non-interactive) for all constructions |
| Record-size regime | 288B (Pung-style), 8kB, 2MB, 40kB, 307kB benchmarked |
Lineage ⤴
| Field | Value |
|---|---|
| Builds on | SealPIR (Angel et al., S&P 2018, Group 1a); XPIR (Aguilar-Melchor et al., 2016, Group 1a); Gentry-Ramzan PIR (ICALP 2005); Chor et al. keyword PIR (1998); Stern's d-dimensional recursion (1998); Borodin-Moenck modular interpolation (1974) |
| What changed | (1) Optimizes SealPIR communication by 75-80% via secret-key encryption, modulus switching, and new oblivious expansion; (2) Introduces MulPIR, which replaces additive-only recursion with multiplicative homomorphism to reduce communication for large entries; (3) Improves Gentry-Ramzan with fast modular interpolation and client-aided exponentiation, reducing server computation by up to 85%; (4) Introduces keyword PIR via simple hashing and cuckoo hashing for sparse databases |
| Superseded by | OnionPIR (2021, builds on MulPIR with external products); Spiral (2022, further optimizes FHE-based PIR communication); SimplePIR/DoublePIR (2023, lattice-based with preprocessing) |
| Concurrent work | SHECS-PIR (Park and Tibouchi, ESORICS 2020) — also uses GSW-style multiplicative homomorphism for PIR 1 |
Core Idea ⤴
This paper presents a systematic study of communication-computation trade-offs across three fundamentally different PIR paradigms: additive HE (SealPIR), somewhat HE with multiplicative homomorphism (MulPIR), and number-theoretic (Gentry-Ramzan). 2 The central insight is that no single construction dominates across all settings — the optimal choice depends on database shape (number and size of entries) and the relative costs of computation versus communication. 3
For SealPIR, the paper achieves "for free" (near-zero computational overhead) communication reductions of up to 80% through three techniques: (1) compressing the upload by using secret-key encryption instead of public-key encryption (halving upload size), (2) compressing the download via modulus switching (reducing by a factor of approximately 2.4x), and (3) a new oblivious expansion algorithm that is linear over the plaintext space and further reduces upload communication. 4
For MulPIR, the paper introduces a PIR construction that uses both additive and multiplicative homomorphism of the FV scheme during recursion. Instead of the response growing exponentially in F (the ciphertext expansion factor) with each recursion level as in SealPIR, MulPIR performs one homomorphic multiplication per recursion step, yielding response size O(c(n)) where c(d) is the size of a ciphertext supporting d successive multiplications — trading higher server computation for dramatically smaller communication for databases with large entries. 5
For Gentry-Ramzan PIR, the paper provides the first practically efficient implementation by employing fast modular interpolation (Õ(n log^2 n) via Borodin-Moenck) for server setup and a novel client-aided variant where the client precomputes powers of the generator, enabling a tunable trade-off between communication and server computation that can reduce computation by up to 85%. 6
Finally, the paper introduces keyword PIR for sparse databases via two hashing-based constructions (simple hashing and cuckoo hashing), enabling PIR with server computation proportional to actual database size rather than the index domain size. 7
Cross-Paradigm Comparison ⤴
| Approach | Paradigm | Upload (kB) | Download (kB) | Server Time (ms) | Server Cost (US cents) | Best For |
|---|---|---|---|---|---|---|
| SealPIR [5] (d=2) | Additive HE (FV/BFV) | 61.4 | 307 | 1,185 (Expand+Response) | 0.0040 | Moderate entries, bandwidth-constrained settings |
| Optimized SealPIR (d=2) | Additive HE (FV/BFV) | 15.4 | 128 | 884 (Expand+Response) | 0.0028 (approximate) | Same as SealPIR but with better communication |
| MulPIR (d=2) | Somewhat HE (FV multiplicative) | 119 | 119 | 1,174 (Expand+Response) | 0.0026 | Large entries (8kB+); balanced upload/download 8 |
| Gentry-Ramzan (1 gen.) | Number-theoretic (CRT + discrete log) | 0.5 | 1.3 | 54,125 (Setup+Response) | 0.0145 | Minimal communication; small DB; cost-insensitive 9 |
| Client-Aided GR (50 gen.) | Number-theoretic (CRT + Straus multi-exp) | 13.1 | 1.3 | 6,654 (Setup+Response) | 0.0011 | Small entries; client willing to precompute 10 |
| ElGamal | Additive HE (NIST P-224r1) | 280 | 8 | 55,044 (Setup+Response) | 0.0091 | Baseline comparison only |
| Damgard-Jurik (s=1) | Additive HE (Paillier generalization) | 1,480 | 0.6 | 120,636 (approximate, Setup+Response) | 0.0382 | N/A in practice (too slow) |
All values from Table 5 (p. 13) for the 1MB database (5,000 elements of 288B) without recursion. Server costs from Table 3 (p. 12) for 288B entries with n=2^20 use recursion d=2.
Cryptographic Foundation ⤴
| Layer | Detail |
|---|---|
| Hardness assumption | RLWE (Ring Learning With Errors) |
| Encryption scheme | Fan-Vercauteren (FV/BFV) via SEAL library |
| Ring / Field | R = Z[x]/(x^N + 1); ciphertexts in (R/qR)^2; plaintexts in R/tR |
| Parameters (SealPIR) | N = 2048, modulus q = 60 bits, plaintext modulus t = 2^12 + 1, 115-bit security 11 |
| Parameters (MulPIR) | N = 8192, modulus q = 160 bits (with 50 + 2 * 55 bit modulus switching), plaintext modulus t = 2^20 + 2^19 + 2^17 + 2^16 + 2^14 + 1, 180-bit security 12 |
| Key operations | Addition, scalar multiplication, ciphertext multiplication (MulPIR only in recursion), substitution (for oblivious expansion) |
| Expansion factor | F = 2 log(q) / log(t); SealPIR: F approximately 10; MulPIR: uses modulus switching to reduce effective F to approximately 4 13 |
| Layer | Detail |
|---|---|
| Hardness assumption | Phi-hiding assumption (difficulty of deciding whether a prime divides phi(m) for composite m); discrete logarithm in Z_m* |
| Mathematical structure | Chinese Remainder Theorem (CRT) representation of the database as a single large integer E, where E = D_i mod pi_i for pairwise coprime moduli pi_i >= 2^l 14 |
| Group structure | Z_m* where m = Q_1 * Q_2, with Q_1 = 2q_1 + 1, Q_2 = 2q_2 * pi_k + 1 safe primes; generator g of subgroup of order q_1 * q_2 * pi_k 15 |
| Server computation | Modular exponentiation: g' = g^E mod m |
| Client decryption | Discrete logarithm via Pohlig-Hellman algorithm in Z_{pi_k} |
| Scheme | Type | Expansion Factor | Key Feature |
|---|---|---|---|
| ElGamal | Additive HE | >= 2 | NIST P-224r1 curve; small plaintext, fast decryption via baby-step-giant-step DLP 16 |
| Damgard-Jurik | Additive HE | >= 1 + 1/s (approaches 1) | RSA modulus N = pq; parameterized by s; uniquely achieves F close to 1, simplifying recursion 17 |
| FV (as SHE) | Somewhat HE | >= 2 | Supports bounded-depth multiplication; used by MulPIR |
Protocol Phases ⤴
| Phase | Actor | Operation | Communication | Notes |
|---|---|---|---|---|
| Query | Client | Generate selection vectors; encode into polynomials using secret-key FV encryption; normalization (2^{-c} mod t) folded into query 18 | d * ceil(d*m / 2^c) ciphertexts upward | Secret-key encryption halves upload vs public-key |
| Expand | Server | Obliviously expand each query ciphertext into selection vectors using Algorithm 7 (new expansion, linear over plaintext space) 19 | -- (internal) | Simplified: each ciphertext expands independently |
| Response | Server | Recursive inner products + modulus switching on intermediate ciphertexts | Single ciphertext (after modulus switching) | Modulus switching reduces download by approximately 2.4x |
| Extract | Client | Decrypt using secret key | -- | Standard FV decryption |
| Phase | Actor | Operation | Communication | Notes |
|---|---|---|---|---|
| Query | Client | Same as optimized SealPIR (Algorithm 6) | ceil(d*m / 2^c) ciphertexts upward | Same upload as optimized SealPIR |
| Expand | Server | Oblivious expansion (Algorithm 7) | -- (internal) | Same expansion as optimized SealPIR |
| Response (recursion) | Server | At each recursion level: homomorphic multiplication of ciphertexts (not additive layering) 20 | -- (internal) | Key difference: uses ct-ct multiplication per recursion level |
| Modulus switch | Server | Apply modulus switching to compress response ciphertexts | Single ciphertext downward | Effective F approximately 4 after modulus switching |
| Extract | Client | Decrypt; parse database elements from FV plaintext | -- | Larger plaintext space packs more data |
| Phase | Actor | Operation | Communication | Notes |
|---|---|---|---|---|
| Setup (offline) | Server | Encode database D via CRT: compute E such that E = D_i mod pi_i; use fast modular interpolation (Borodin-Moenck, Õ(n log^2 n)) 21 | -- | One-time; reusable across queries if DB unchanged |
| Query | Client | Generate safe primes Q_1, Q_2; compute m = Q_1 * Q_2; sample generator g of appropriate subgroup | (m, g) sent to server: 3 * λ_GR bits 22 | Client must generate large primes (expensive) |
| Response | Server | Compute g' = g^E mod m (modular exponentiation with exponent E) | g' in Z_m*: single group element | Server computation dominates |
| Extract | Client | Compute h = g'^{q_2}, h' = g'^{q_1*q_2}; solve for d via Pohlig-Hellman in Z_{pi_k} | -- | DLP in small subgroup is efficient |
| Phase | Actor | Operation | Communication | Notes |
|---|---|---|---|---|
| Query | Client | Same as GR but additionally precompute l+1 powers: g, g^b, g^{b^2}, ..., g^{b^l} using Euler's theorem to reduce exponents 23 | (m, g, g^b, ..., g^{b^l}) to server | Extra l group elements in upload |
| Response | Server | Rewrite E in base b; compute product of multi-exponentiations using Straus's algorithm 24 | g' in Z_m*: single group element | Server computation reduced proportional to number of generators |
| Extract | Client | Same as standard GR | -- | -- |
| Phase | Actor | Operation | Communication | Notes |
|---|---|---|---|---|
| Pre-processing | Server | Hash each (key, data) pair into one of m bins using public hash function H | H sent to client | Bucket size grows with n |
| Query | Client | Compute H(i) for query key i; run PIR.Query(H(i)) on the bin database | PIR query for single bucket | Works well with MulPIR (large plaintext absorbs bucket) |
| Response | Server | Run PIR.Response on the queried bucket | PIR response | -- |
| Extract | Client | Decrypt; search bucket for desired key | -- | -- |
| Phase | Actor | Operation | Communication | Notes |
|---|---|---|---|---|
| Pre-processing | Server | Build cuckoo hash table with kappa hash functions (H_1, ..., H_kappa); each item placed in exactly one of kappa locations 25 | Hash functions sent to client | Table size proportional to |
| Query | Client | Compute H_j(i) for j in [kappa]; run kappa PIR queries (one per hash function) | kappa PIR queries | Constant number of queries regardless of DB size |
| Response | Server | Run kappa PIR responses | kappa PIR responses | -- |
| Extract | Client | Decrypt all kappa responses; output item if found, else bottom | -- | Correctness guaranteed by cuckoo hashing properties |
Complexity ⤴
| Metric | SealPIR [5] | Optimized SealPIR | MulPIR (d=2) | Phase |
|---|---|---|---|---|
| Upload size | 61.4 kB | 15.4 kB | 122 kB | Online |
| Download size | 307 kB | 128 kB | 119 kB | Online |
| Total communication | 368.4 kB | 143.4 kB | 241 kB | Online |
| Client Query (ms) | 19 | 19 | 172 | Online |
| Server Expand (ms) | 145 | 294 | 391 | Online |
| Server Response (ms) | 1,020 | 590 (approximate) | 1,919 | Online |
| Server Cost (US cents) | 0.0033 | 0.0028 (approximate) | 0.0026 | Online |
Values from Table 3 (p. 12), n = 262,144 (= 2^18) column; for n = 2^20 see the 1,048,576 columns. 26
| Metric | GR (1 gen.) | Client-Aided GR (15 gen.) | Client-Aided GR (50 gen.) | Client-Aided GR (100 gen.) |
|---|---|---|---|---|
| Upload (kB) | 0.5 | 4.1 | 13.1 | 25.8 |
| Download (kB) | 1.3 | 1.3 | 1.3 | 1.3 |
| S.Setup (ms) | 1,532 | 1,540 | 1,594 | 1,796 |
| C.Create (ms) | 3,294 | 2,688 | 3,966 | 7,980 |
| S.Respond (ms) | 51,803 | 5,495 | 2,988 | 2,904 |
| Server Cost (US cents) | 0.0145 | 0.0016 | 0.0011 | 0.0014 |
Values from Table 5 (p. 13), 1MB database row. 27
Asymptotic communication complexity (Table 2, p. 8; Gentry-Ramzan from Section 4 and Table 8, p. 20)
| HE Type | Recursion d (1 <= d <= log n) | Recursion d = log n | Computation Cost |
|---|---|---|---|
| Additive HE | O(d*n^{1/d} + F^{d-1}) ciphertexts | O(log n + F^{log n - 1}) | n(A + S) to n^{1/d} * F^{d-1} * (A + S) |
| Somewhat HE (MulPIR) | O(d*n^{1/d}) ciphertexts | O(log n) | n(A + S) + n^{(d-1)/d} * M |
| Fully HE | -- | O(log n) | n * log(n) * M + n(A + S + M) |
| Gentry-Ramzan | O(λ_GR) (constant!) | N/A | 2 * n * pt multiplications of λ_GR-bit numbers |
Where A = addition, S = scalar multiplication, M = ciphertext multiplication. 28
| Entry size | Scheme | Upload (kB) | Download (kB) |
|---|---|---|---|
| 288B | SealPIR [5] | 61.4 | 307.2 |
| 288B | Optimized SealPIR (w/o Remark 1) | 15.4 | 128 |
| 288B | Optimized SealPIR (w/ Remark 1) | 15.4 | 64 |
| 288B | MulPIR | 119 | 119 |
| 8kB | SealPIR [5] | 61.4 | 921 |
| 8kB | Optimized SealPIR (w/o Remark 1) | 15.4 | 384 |
| 8kB | MulPIR | 119 | 119 |
| 2MB | SealPIR [5] | 61.4 | 200,294 |
| 2MB | Optimized SealPIR (w/o Remark 1) | 15.4 | 83,456 |
| 2MB | MulPIR | 119 | 13,660 |
MulPIR's download advantage grows dramatically with entry size because its response does not scale with F^{d-1}. 29
Performance Benchmarks ⤴
All experiments on a virtual machine with Intel Xeon E5-2695 v3 @ 2.30 GHz, 128 GB RAM, Debian. Monetary costs computed using Google Cloud Platform prices: 1 cent per CPU-hour and 8 cents per GB of internet traffic. 30
| Database size n | 262,144 | 1,048,576 | 4,194,304 |
|---|---|---|---|
| SealPIR [5] (d=2) | |||
| Client Query (ms) | 19 | 19 | 19 |
| Server Expand (ms) | 145 | 294 | 590 |
| Server Response (ms) | 1,020 | 3,520 | 12,891 |
| Upload (kB) | 61.4 | 61.4 | 61.4 |
| Download (kB) | 307 | 307 | 307 |
| Server Cost (US cents) | 0.0033 | 0.0040 | 0.0067 |
| MulPIR (d=2) | |||
| Client Query (ms) | 172 | 192 | 213 |
| Server Expand (ms) | 391 | 783 | 1,610 |
| Server Response (ms) | 1,919 | 5,213 | 16,307 |
| Upload (kB) | 122 | 122 | 122 |
| Download (kB) | 119 | 119 | 119 |
| Server Cost (US cents) | 0.0026 | 0.0036 | 0.0069 |
MulPIR achieves lower server cost than SealPIR for most database sizes due to its lower communication costs, despite higher computation. 31
| Metric | Optimized SealPIR | MulPIR |
|---|---|---|
| Client Query (ms) | 42 | 263 |
| Server Expand (ms) | 357 | 3,560 |
| Server Response (ms) | 47,712 | 52,280 |
| Upload (kB) | 15 | 119 |
| Download (kB) | 1,792 | 238 |
| Server Cost (US cents) | 0.028 | 0.018 |
For large entries, MulPIR reduces communication of SealPIR by 7x at similar computation cost, reducing total monetary server cost by 35%. 32
| Scheme | # chunks | Upload (kB) | Download (kB) | S.Setup (ms) | S.Respond (ms) | Server Cost (US cents) |
|---|---|---|---|---|---|---|
| MulPIR | 100 | 79.4 | 1,385 | 88,815 | 34,388 | 0.0417 |
| Client-Aided GR (50 gen.) | 4,955 | 13.1 | 1,259 | 1,347,036 | 28,684 | 0.0016 (approximate) |
| Damgard-Jurik (s=1) | 1,060 | 2,960 | 614 | approximately 80,000 | approximately 3,200 | 11.7451 |
| ElGamal | 76,800 | 280 | 4,300 | approximately 300 | approximately 88,800 | 1.4338 |
Values from Table 5 (p. 13). Timings marked with "approximately" are estimated on a smaller number of chunks. 33
| Bucket size | GR Comm. (kB) | GR Client (ms) | GR Server (ms) | GR Server Cost | MulPIR Comm. (kB) | MulPIR Client (ms) | MulPIR Server (ms) | MulPIR Server Cost |
|---|---|---|---|---|---|---|---|---|
| 10k | 10.4 | 24,324 | 317 | 0.00017 | 90.5 | 156 | 475 | 0.00086 |
| 50k | 10.4 | 24,906 | 1,649 | 0.00054 | 90.5 | 195 | 810 | 0.00095 |
| 200k | 10.4 | 30,644 | 2,774 | 0.00085 | 90.5 | 195 | 830 | 0.00095 |
| 1M | 10.4 | 49,819 | 31,055 | 0.0087 | 90.5 | 265 | 3,742 | 0.0018 |
GR achieves best communication (10.4 kB constant) but client computation is orders of magnitude higher. MulPIR outperforms GR in total server cost for bucket sizes >= 200k. 34
Novel Constructions and Techniques ⤴
The key insight is that oblivious expansion (Algorithm 5 from SealPIR) is linear over the plaintext space: if the input ciphertext encrypts m = sum(m_i * x^i), then expansion produces N ciphertexts, each encrypting the corresponding m_i. 35 This linearity enables two optimizations:
- Normalization in query: The 2^{-c} mod t normalization factor is folded into the Query algorithm (Algorithm 6, line 4) rather than applied server-side after expansion (as in SealPIR's Algorithm 5, lines 20-22). This avoids n scalar multiplications on ciphertexts.
- Concatenated expansion: Instead of d independent expansions of ceil(m/2^c) ciphertexts each, all d selection vectors are concatenated and encrypted in ceil(d*m/2^c) ciphertexts total, simplifying the expansion (Algorithm 7 vs Algorithm 5).
MulPIR replaces the additive-HE-based recursion in SealPIR with ciphertext-ciphertext multiplication at each recursion level. 36 In additive recursion, each recursion level produces F plaintext elements per intermediate ciphertext, causing the response to grow as F^{d-1} ciphertexts. In MulPIR, the server performs one homomorphic multiplication per level, keeping the response to a single ciphertext (of size c(d) supporting d multiplications). This requires parameters (N, q) large enough to support the multiplicative depth d, making individual operations more expensive but dramatically reducing total communication for large entries.
The CRT encoding E = sum(D_k * a_k * M_k) naively requires Omega(n^2) time because each M_k = M / pi_k has size Omega(n). The paper applies Borodin-Moenck's divide-and-conquer modular interpolation: split the set of moduli in half, factor out the product of each half, and recurse. Using Schonhage-Strassen integer multiplication, total time is Õ(n * log^2(n) * log(log(n))). 37 The supermoduli M_1, M_2 and inverses a_k can be precomputed and reused across queries when the moduli set remains the same.
The server rewrites the exponent E in base b >= 2: E = E_0 + E_1b + E_2b^2 + ... + E_l*b^l. The client precomputes g, g^b, g^{b^2}, ..., g^{b^l} using Euler's theorem to efficiently reduce exponents modulo phi(m). The server then computes g^E = g^{E_0} * (g^b)^{E_1} * ... * (g^{b^l})^{E_l} as a product of l+1 multi-exponentiations using Straus's algorithm. 38 The security argument is that revealing powers of g leaks no information since the server could compute them itself (just not as fast, since the server does not know phi(m)).
Keyword PIR for Sparse Databases (Section 5) ⤴
Standard PIR assumes dense indices [1, n], but many real-world databases are sparse: the database has |D| elements indexed by keys from a much larger domain. Keyword PIR aims for server computation proportional to |D|, not the domain size. 39
Server selects hash function H mapping keys to m bins. Each database element (i, d) is placed in bin H(i). Client queries for key i by running PIR on the bucket at H(i). Bucket size grows as O(|D| / m), which works well with MulPIR (large plaintext space absorbs large buckets) but poorly with Gentry-Ramzan (which wants small plaintexts).
Server builds a cuckoo hash table with kappa hash functions, guaranteeing each element is in exactly one of kappa possible locations. Table size is proportional to |D| with constant multiplicative overhead. Client makes kappa PIR queries (one per hash function). This construction ensures each bucket contains at most one element, making it ideal for Gentry-Ramzan (small plaintext) and enabling CRT batching of the kappa queries into a single Gentry-Ramzan query. 40
Comparison with Prior Work ⤴
| Metric | MulPIR | ElGamal | Damgard-Jurik (s=1) | Gentry-Ramzan (1 gen.) | Client-Aided GR (50 gen.) |
|---|---|---|---|---|---|
| Upload (kB) | 14 | 280 | 1,480 | 0.5 | 13.1 |
| Download (kB) | 21 | 8 | 0.6 | 1.3 | 1.3 |
| Total comm. (kB) | 35 | 288 | 1,480.6 | 1.8 | 14.4 |
| S.Setup (ms) | 39 | 29 | 40,636 | 1,532 | 1,594 |
| S.Respond (ms) | 3,910 | 893 | 20,710 | 51,803 | 2,988 |
| Server Cost (US cents) | 0.0019 | 0.0091 | 0.0382 | 0.0145 | 0.0011 |
Key findings: Client-Aided GR (50 generators) achieves the lowest server cost for small entries. MulPIR achieves the best server cost among HE-based schemes. Damgard-Jurik and ElGamal are significantly slower on the client side. 41
- Optimized SealPIR reduces upload by 75% (61.4 kB to 15.4 kB) and download by up to 80% (307 kB to 64 kB with Remark 1 optimization) at essentially no additional computation cost 42
- MulPIR further reduces download for large entries: at 2MB entries, MulPIR download is 13,660 kB vs SealPIR's 200,294 kB (14.7x reduction) 43
For databases where downloading everything is viable, PIR becomes worthwhile when communication cost exceeds computation cost. At Google Cloud Platform prices ($0.01/CPU-hour, $0.08/GB), the paper shows PIR is cost-effective for databases above a few thousand entries.
Key Tradeoffs & Limitations ⤴
-
MulPIR computation vs communication: MulPIR achieves dramatically better communication for large entries but requires larger parameters (N=8192 vs 2048) and costlier ciphertext multiplication, yielding approximately 2x higher server computation than SealPIR for 288B entries. 44
-
Gentry-Ramzan prime generation bottleneck: Each query requires generating fresh safe primes (Q_1, Q_2) of λ_GR bits, which takes 3-50 seconds on the client depending on the application — impractical for latency-sensitive applications. 45
-
Recursion depth d=2 is optimal in practice: For both SealPIR and MulPIR, d=3 does not improve communication or computation over d=2, because the upload already consists of a single ciphertext at d=2. 46
-
Entry size determines optimal scheme: For small entries (288B), additive HE (SealPIR) with d=2 achieves the lowest monetary cost among HE schemes. For large entries (40kB+), MulPIR dominates. For minimal communication with small databases, client-aided Gentry-Ramzan wins. 47
-
Keyword PIR hashing tradeoff: Simple hashing produces large buckets (good for MulPIR, bad for GR); cuckoo hashing produces unit-size buckets with kappa queries (good for GR with CRT batching, adds kappa multiplicative overhead for HE schemes). 48
-
Modulus switching tradeoff (Remark 1): Applying modulus switching at each recursion level further reduces download by a factor of approximately log_2(p)/log_2(q) but at the cost of n^{1/2} additional modular reductions, significantly increasing computation. 49
-
No malicious security: All constructions assume semi-honest servers. The SPIR extension (Appendix E.1) provides database privacy via OPRFs but does not handle malicious servers.
Implementation Notes ⤴
- Language / Libraries: C++ using SEAL 3.2.0 (SealPIR) and SEAL 3.5.4 (MulPIR); OpenSSL for BigNum and elliptic curve operations (ElGamal, Damgard-Jurik, Gentry-Ramzan) 50
- SealPIR implementation: Based on Microsoft's open-source SealPIR (https://github.com/microsoft/SealPIR)
- Polynomial arithmetic: NTT-based (via SEAL)
- CRT batching for GR: Uses Groth et al. [36] technique to batch kappa cuckoo hash queries into a single GR query (Appendix E.2)
- Parallelism: Single-threaded benchmarks throughout
- Precomputation: GR supermoduli M_1, M_2 and inverses a_k are precomputed and reusable when moduli set is unchanged
Open Problems ⤴
-
Closing the communication gap: MulPIR achieves O(log n) communication asymptotically but the constants (large N, q for multiplicative depth) remain high. Can parameters be reduced while maintaining security? 51
-
Practical Gentry-Ramzan at scale: The fast modular interpolation makes setup feasible, but server response (modular exponentiation of a huge integer) remains a bottleneck for large databases. The paper demonstrates viability for approximately 50k entries but not beyond. 52
-
Reducing client cost in Gentry-Ramzan: Prime generation dominates client time. Can safe primes be precomputed or amortized across queries?
-
Better keyword PIR constructions: The simple hashing and cuckoo hashing approaches each have limitations (large buckets vs multiple queries). Can a single construction achieve constant queries with constant-size buckets?
-
Combining paradigms: The paper shows different schemes win in different regimes. A system that dynamically selects the optimal scheme based on database parameters and network conditions could provide uniformly good performance.
-
Beyond semi-honest security: Extending the constructions to handle malicious servers, particularly for the password checkup application where server integrity is important.
Uncertainties ⤴
-
Exact parameter selection for MulPIR: The paper states "polynomial of dimension 8192 with 50 + 2 * 55 bit modulus, modulus switching to 50 bits, and plaintext modulus t = 2^20 + 2^19 + 2^17 + 2^16 + 2^14 + 1" (Table 1 footnote, p. 7) but does not fully justify why this specific plaintext modulus was chosen or how it was determined to support exactly the required multiplicative depth.
-
Gentry-Ramzan security level: The paper uses λ_GR-bit primes but does not specify the exact security level achieved for the benchmarked parameter sets. The 2048-bit modulus and block size of 500 are mentioned (Section 6.2, p. 11) but the concrete security guarantee is not stated.
-
Remark 1 cost estimation: The paper notes that the Remark 1 optimization (modulus switching at each recursion level) is "computation-expensive" and omits it from fair comparisons (footnote 4, p. 11). The exact computational overhead is not quantified.
-
Approximate values in Table 5: Timings for Damgard-Jurik and ElGamal on the 3GB database are explicitly marked as "approximately" estimated from a smaller number of chunks (Table 5 footnote, p. 13).
-
Naming convention: The filename references "SealPIR_KeywordPIR" but the paper's primary contribution is broader — it covers SealPIR optimizations, MulPIR (a new construction), Gentry-Ramzan improvements, and keyword PIR. The paper's actual title is "Communication-Computation Trade-offs in PIR."
Footnotes
-
Appendix A (p. 16): "Park and Tibouchi [51] present a construction that uses GSW-style homomorphic encryption that support logarithmic multiplicative degree and achieves O(log n) communication." ↩
-
Abstract (p. 1): "We study the computation and communication costs and their possible trade-offs in various constructions for private information retrieval (PIR), including schemes based on homomorphic encryption and the Gentry-Ramzan PIR." ↩
-
Section 1 (p. 2): "We analyze the communication-computation trade-offs that different PIR construction approaches offer and the hurdles towards achieving the optimal asymptotic communication costs in practice." ↩
-
Section 1.1 (p. 2): "Our first contribution reduces the communication of SealPIR by (1) using symmetric key encryption to reduce the upload size, (2) using modulus switching reduction techniques... to reduce the value of F down to F approximately 4, and (3) introducing a new oblivious expansion algorithm which can further halve the upload communication for some parameter sets." ↩
-
Section 1.1 (p. 2): "We then present MulPIR, a PIR protocol additionally leveraging multiplicative homomorphism to implement the recursion steps in PIR... it introduces a meaningful tradeoff by significantly reducing communication, at the cost of an increased computational cost for the server." ↩
-
Section 1.1 (p. 2-3): "We leverage a divide-and-conquer modular interpolation algorithm [7] that enables us to achieve computation complexity Õ(n log^2 n)... This enables a client-aided technique that allows us to improve the server's computation at the price of (small) additional work at the client." ↩
-
Section 1.1 (p. 3): "we introduce new ways to handle PIR over sparse databases (keyword PIR), based on different hashing techniques." ↩
-
Table 1 (p. 7): MulPIR (d=2) for n=2^20 shows upload 119 kB, download 119 kB — perfectly balanced communication. ↩
-
Table 5 (p. 13): Gentry-Ramzan (1 generator) achieves 0.5 kB upload / 1.3 kB download for 5,000-element database — orders of magnitude less communication than any HE scheme. ↩
-
Table 5 (p. 13): Client-Aided GR (50 generators) achieves lowest server cost at 0.0011 US cents for the 1MB database. ↩
-
Section 6.1 (p. 11): "For SealPIR, we use the parameters of [5]: polynomials of dimension 2048 and a modulus of 60 bits, providing 115 bits of security. The plaintext modulus has a size of 12 bits (d=2) / 16 bits (d=3)." The exact value t = 2^12 + 1 is stated in the Table 1 footnote (p. 7): "plaintext modulus t = 2^12 + 1." ↩
-
Table 1 footnote (p. 7): "For MulPIR, we use a polynomial of dimension 8192 with 50 + 2 * 55 bit modulus, modulus switching to 50 bits, and plaintext modulus t = 2^20 + 2^19 + 2^17 + 2^16 + 2^14 + 1." ↩
-
Section 3.1 (p. 6): "reduces the download size by approximately log_2(q)/(2 * log_2(t)); using SealPIR parameters and using modulus switching to a prime p approximately 2^25, this technique enables to reduce the download by a factor 60/25 = 2.4x." ↩
-
Section 4.1 (p. 8), Equation (2): "E <= product(pi_i), and E = D_i mod pi_i for all i in [n]." ↩
-
Procedure 8 (p. 9): "Q_1 := 2q_1 + 1 s.t. Q_1 and q_1 are prime and log_2(Q_1) >= λ. Q_2 := 2q_2*pi_k + 1 s.t. Q_2 and q_2 are prime and log_2(Q_2) >= λ." ↩
-
Table 7 / Appendix D (p. 19): ElGamal plaintext size is "pt small", expansion >= 2, decryption requires 2^{pt} multiplications of λ_EG-bit numbers. ↩
-
Appendix D (p. 19): "The ciphertext expansion F can be made as small as desired and F > 1. This unusual property enables to simplify the recursion in PIR." ↩
-
Algorithm 6, line 4 (p. 7): "m_j <- sum_{i in [2^c]} (2^{-c} mod t) * s'_j[i] * x^i in R/tR" — normalization folded into query. ↩
-
Section 3.2 (p. 7): "The key insight behind our new algorithms is that oblivious expansion (Algorithm 5) is linear over the plaintext space." ↩
-
Section 3.4 (p. 7-8): "We introduce MulPIR, a variant of SealPIR with the optimizations above, which further replaces the emulated multiplications with homomorphic multiplications during recursion." ↩
-
Section 4.2 (p. 9): "we rely on the modular interpolation algorithm by Borodin and Moenck [7]... Repeating the above transformation recursively leads to a divide-and-conquer algorithm for modular interpolation, which, using the Schonhage-Strassen integer multiplication [59], has a total running time of O(n log^2 n log log n) [7]." ↩
-
Procedure 8 (p. 9): Query output is (m, g) in Z x Z_m*, where m = Q_1 * Q_2 with each Q_i of λ bits. ↩
-
Section 4.3 (p. 9): "the client can use Eq. (3)... these l exponentiations may be efficiently computed by the client using the prime factorization of m." ↩
-
Section 4.3 (p. 9): "For our implementation, we choose Straus's algorithm [62], a description of which can be found in [39, Alg. 14.88]." ↩
-
Section 5, Construction 1 (p. 10): "The server generates cuckoo hash functions using the data dependent key generation Cuckoo.KeyGen(D) and builds a cuckoo hash table for its sparse database." ↩
-
Table 3 (p. 12): Communication and CPU costs of SealPIR and MulPIR for n elements of 288B. ↩
-
Table 5 (p. 13): Communication and computation costs for PIR protocols on 1MB database (5,000 elements of 288B). ↩
-
Table 2 (p. 8): "Communication-Computation Trade-Off of homomorphic encryption based PIR Protocols... This table aims at giving an insight on the overall trend but does not accurately reflect the costs." The Gentry-Ramzan row is editorially assembled from Section 4 (p. 8-9) and Table 8 (p. 20), which lists GR baseline communication as 3lambda_GR bits with computation of 2n*pt multiplications of lambda_GR-bit numbers. ↩
-
Table 1 (p. 7): For 2MB entries, SealPIR download is 200,294 kB vs MulPIR's 13,660 kB. ↩
-
Section 6 (p. 11): "All our experiments are performed in a virtual machine with a Intel(R) Xeon(R) CPU E5-2695 v3 @ 2.30GHz and 128GB, running Debian. Monetary costs were computed using Google Cloud Platform prices [1], which at the time of writing were at one cent per CPU-hour and 8 cents per GB of internet traffic." ↩
-
Table 3 (p. 12): MulPIR server cost 0.0036 vs SealPIR 0.0040 at n=2^20 with 288B entries. ↩
-
Table 4 (p. 12): "MulPIR enables to reduce the communication of SealPIR by a factor 7x in that setting, which also results in a reduction of the monetary server costs by 35%." ↩
-
Table 5 footnote (p. 13): "Median over 10 computations. The timings indicated with approximately have been estimated on a smaller number of chunks to finish in a reasonable amount of time." ↩
-
Table 6 (p. 13): Password Checkup application comparing Gentry-Ramzan and MulPIR across bucket sizes. ↩
-
Section 3.2 (p. 7): "if m = sum_{i in [N]} m_i * x^i in R/tR, then the output of the oblivious expansion consists of N ciphertexts, respectively encrypting each of the m_i's in the constant coefficient of the plaintexts." ↩
-
Appendix B.2 (p. 18): "the server computes the response with the (encryption of the) selection vector s_1 using homomorphic multiplication... c = Enc(sk, <s_1, {D_{i'n^{1/2}+j'}}{i}>) = Enc(sk, D{i'n^{1/2}+j'})." ↩
-
Section 4.2 (p. 9): Borodin-Moenck modular interpolation achieving O(n log^2 n log log n) total running time. ↩
-
Section 4.3 (p. 9): "the server rewrites the large exponent E according to some base b >= 2... g^E = g^{E_0} * (g^b)^{E_1} * (g^{b^2})^{E_2} * ... * (g^{b^l})^{E_l}." ↩
-
Section 5 (p. 9-10): "The traditional setting for PIR over a database of size n assumes that each database element has a unique index in [n]... In some scenarios, such dense indices are not immediately available, and database elements are instead indexed by keywords from a much larger domain." ↩
-
Section 5 (p. 10): "our approach leverages cuckoo hashing in a different way... we apply cuckoo hashing on the server side, to compress the domain of its indices." ↩
-
Table 5 (p. 13): Comprehensive cost comparison across all implemented schemes on 1MB database. ↩
-
Table 1 (p. 7): Upload reduced from 61.4 kB to 15.4 kB (75%); download from 307.2 kB to 64 kB (79%) with Remark 1. ↩
-
Table 1 (p. 7): For 2MB entries, MulPIR download 13,660 kB vs SealPIR 200,294 kB. ↩
-
Table 3 (p. 12): MulPIR server Expand + Response time is approximately 2x SealPIR's at n=262,144 (2,310 ms vs 1,165 ms). ↩
-
Table 5 (p. 13): C.Create for Gentry-Ramzan is 3,294 ms for 1MB database; Table 6 shows 24,324 ms for password checkup with 10k bucket. ↩
-
Section 6.1 (p. 11): "we observe that d = 3 doesn't improve either communication or computation of MulPIR or SealPIR, due to the fact that the upload for d = 2 already consists of only a single ciphertext." ↩
-
Section 6 (p. 11): "These results can inform decision making of what is the most appropriate PIR instantiation for a particular application." ↩
-
Section 5 (p. 10): Simple hashing "has the drawback that the size of the buckets grows asymptotically with the number of items n" while cuckoo hashing "ensures that each bucket only contains a single database element." ↩
-
Remark 1 (p. 6): "one can further reduce the communication requirement at the cost of increasing the computation cost... perform n^{1/2} modulus switching." ↩
-
Section 6.1 (p. 11): "We use the SealPIR implementation available on Microsoft's GitHub [3] based on Seal 3.2.0, and we implement MulPIR with Seal 3.5.4." Section 6.2 (p. 11): "All the implementations are standalone and rely only on OpenSSL for BigNum and elliptic curve operations." ↩
-
Table 2 (p. 8): Somewhat HE achieves O(log n) communication at d = log n, but the computation cost includes n^{(d-1)/d} * M multiplicative operations. ↩
-
Table 5 (p. 13): GR server response time of 51,803 ms for 5,000 entries; extrapolation suggests impractical times for databases much larger than approximately 50k entries. ↩