Generated: 2026-02-11
| AWS m6i.xlarge | Azure DC4ads_v5 | GCP c3-standard-4 | |
|---|---|---|---|
| CPU | Intel Xeon 8375C (Ice Lake) | AMD EPYC 7763 (Milan) | Intel Sapphire Rapids (4th Gen Xeon) |
| vCPUs | 4 | 4 | 4 |
| Memory | 15 GiB | 14 GiB | 14 GiB |
| Security | Standard VM (none) | SEV-SNP Confidential VM | TDX Confidential VM |
| Kernel | 6.1.159 (Amazon Linux 2023) | 6.8.0-1044-azure-fde | 6.14.0-1021-gcp |
| Rust | 1.93.0 | 1.93.0 | 1.93.0 |
| Date | 2026-02-07 | 2026-02-07 | 2026-02-11 |
| Metric | AWS m6i | Azure DC4ads_v5 | GCP c3-standard-4 |
|---|---|---|---|
| p50 | 139.0 us | 168.2 us | 142.9 us |
| p95 | 139.7 us | 174.7 us | 148.0 us |
| p99 | 143.0 us | 189.1 us | 155.2 us |
| 95% CI | [138.9 .. 139.1] us | [168.1 .. 168.4] us | [142.7 .. 143.0] us |
Takeaway: AWS and GCP within 3% of each other (both Intel). Azure ~21% slower (AMD EPYC Milan vs Intel Xeon). This measures X25519 keygen + HKDF — CPU crypto performance, not memory encryption.
| Metric | AWS m6i | Azure DC4ads_v5 | GCP c3-standard-4 |
|---|---|---|---|
| Plaintext duplex | 20.2 us | 101.1 us | 84.4 us |
| SC p50 | 33.2 us | 123.8 us | 108.0 us |
| Overhead | 64.1% | 22.4% | 27.9% |
| Metric | AWS m6i | Azure DC4ads_v5 | GCP c3-standard-4 |
|---|---|---|---|
| Plaintext duplex | 20.9 us | 103.7 us | 85.4 us |
| SC p50 | 42.1 us | 139.5 us | 124.8 us |
| Overhead | 100.8% | 34.5% | 46.1% |
384KB hidden state ([128, 768] F32)
| Metric | AWS m6i | Azure DC4ads_v5 | GCP c3-standard-4 |
|---|---|---|---|
| Plaintext duplex | 148.4 us | 199.6 us | 214.3 us |
| SC p50 | 1.85 ms | 1.62 ms | 1.86 ms |
| Overhead | 1148% | 709% | 766% |
Takeaway: AWS has the lowest absolute latency for small messages but the highest crypto overhead %. Azure/GCP have higher baseline duplex latency (memory encryption overhead from SEV-SNP/TDX) but the crypto overhead % is lower because the baseline is already higher. For large payloads (384KB), Azure is fastest — AMD EPYC Milan's memory subsystem handles large AEAD operations well.
| Payload | AWS m6i | Azure DC4ads_v5 | GCP c3-standard-4 |
|---|---|---|---|
| 1536B embedding | 488 MB/s | 411 MB/s | 403 MB/s |
| 4KB activation | 751 MB/s | 618 MB/s | 604 MB/s |
| 384KB hidden | 825 MB/s | 725 MB/s | 918 MB/s |
| 1MB large | 765 MB/s | 519 MB/s | 785 MB/s |
Takeaway: AWS wins on small-message throughput (no memory encryption overhead on plaintext path). GCP TDX wins on large-message throughput — Sapphire Rapids' memory bandwidth advantage with TDX is smaller than SEV-SNP's on Milan. All platforms sustain >400 MB/s encrypted throughput even for the smallest payload.
| Component | AWS m6i | Azure DC4ads_v5 | GCP c3-standard-4 |
|---|---|---|---|
| Handshake | 0.14% | 0.17% | 0.14% |
| AEAD RTT (1536B) | 0.03% | 0.12% | 0.11% |
| Total overhead | <0.17% | <0.29% | <0.25% |
Bottom line: Transport crypto overhead is <0.3% of inference time on all three platforms, including real confidential VMs with hardware memory encryption (SEV-SNP, TDX). The encryption cost is negligible compared to model computation.
-
Confidential VM memory encryption is measurable but small. SEV-SNP (Azure) adds ~4-5x to baseline duplex latency for small messages vs non-confidential (AWS). TDX (GCP) adds ~4x. This is the cost of AES-XTS memory encryption at the hardware level.
-
Our AEAD overhead is consistent across platforms. ChaCha20-Poly1305 adds 22-64% on top of the already-encrypted baseline (for small messages). The crypto is CPU-bound and behaves predictably.
-
Large messages are dominated by memory bandwidth, not crypto. At 384KB, all platforms converge to 700-920 MB/s encrypted throughput. The AEAD seal/open becomes a smaller fraction of total time.
-
No platform is a clear overall winner. AWS is fastest for small messages (no memory encryption overhead). GCP TDX is fastest for large messages. Azure SEV-SNP is in the middle. Choose based on workload shape.
- All measurements over
tokio::io::duplex(in-process, no network) - Mock attestation (real attestation adds ~1-5ms to handshake, amortized over session)
- Same Rust toolchain (1.93.0) on all platforms
- AWS is a non-confidential baseline; Azure and GCP are real confidential VMs
- Throughput differences partly reflect burst count changes between benchmark runs