Skip to content

fix(cpu): detect PCLMULQDQ support to prevent SIGILL on Hygon/QEMU#934

Open
liuq19 wants to merge 2 commits into
bytedance:mainfrom
liuq19:fix/detect-pclmulqdq
Open

fix(cpu): detect PCLMULQDQ support to prevent SIGILL on Hygon/QEMU#934
liuq19 wants to merge 2 commits into
bytedance:mainfrom
liuq19:fix/detect-pclmulqdq

Conversation

@liuq19

@liuq19 liuq19 commented Mar 31, 2026

Copy link
Copy Markdown
Collaborator

Summary

  • Replace PCLMULQDQ with scalar prefix-XOR in the SSE native code path, so Sonic works on CPUs without PCLMULQDQ support
  • Add HasPCLMULQDQ runtime detection; AVX2 path requires it, SSE path does not
  • Regenerate all SSE native assembly (35 files)

Background

Both AVX2 and SSE native code paths used PCLMULQDQ / VPCLMULQDQ instructions for carry-less multiplication in the SIMD JSON string scanner (native/scanning.h:_mm_clmulepi64_si128). The CPU feature detection only checked for AVX2 and SSE — not PCLMULQDQ.

This caused SIGILL crashes on:

  • Hygon (海光) Dhyana CPUs under QEMU/KVM without host-passthrough
  • QEMU VMs with restrictive CPU models that don't expose PCLMULQDQ

The previous workaround SONIC_MODE=noavx2 (suggested in #867) was insufficient because the SSE fallback path also used PCLMULQDQ (non-VEX encoded 66 0F 3A 44).

Changes

File Change
native/scanning.h Gate _mm_clmulepi64_si128 behind #ifdef USE_PCLMUL; scalar prefix-XOR fallback
scripts/build-x86.sh SSE: -mno-pclmul; AVX2: -mpclmul -DUSE_PCLMUL=1
internal/native/sse/* Regenerated 35 files — no PCLMULQDQ instructions
internal/cpu/features.go Add HasPCLMULQDQ = cpuid.CPU.Has(cpuid.CLMUL)
internal/native/dispatch_amd64.go AVX2 requires HasAVX2 && HasPCLMULQDQ; SSE works without

Dispatch logic

HasAVX2 + HasPCLMULQDQ  →  AVX2 path (uses VPCLMULQDQ)
HasSSE (no PCLMULQDQ)   →  SSE path  (scalar prefix-XOR fallback)

The scalar prefix-XOR is mathematically equivalent to clmul(x, 0xFFFFFFFFFFFFFFFF):

// 6-step shift-XOR cascade — computes running XOR (prefix sum in GF(2))
uint64_t inquote = quote_mask;
inquote ^= inquote << 1;
inquote ^= inquote << 2;
inquote ^= inquote << 4;
inquote ^= inquote << 8;
inquote ^= inquote << 16;
inquote ^= inquote << 32;

Fixes #867

Test plan

  • go build ./... passes
  • All tests pass with default mode (AVX2 path): go test ./... -count=1
  • All tests pass with SSE-only mode: SONIC_MODE=noavx go test ./... -count=1
  • Verified SSE assembly contains zero pclmulqdq instructions
  • Verified AVX2 assembly still contains vpclmulqdq (unchanged)
  • Verify on Hygon QEMU VM without PCLMULQDQ: no SIGILL, SSE path auto-selected

🤖 Generated with Claude Code

Copilot AI review requested due to automatic review settings March 31, 2026 08:55
@codecov-commenter

codecov-commenter commented Mar 31, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 0% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 51.27%. Comparing base (59be92f) to head (5df9851).
⚠️ Report is 71 commits behind head on main.

Files with missing lines Patch % Lines
internal/cpu/features.go 0.00% 2 Missing ⚠️
internal/native/dispatch_amd64.go 0.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #934      +/-   ##
==========================================
- Coverage   51.86%   51.27%   -0.60%     
==========================================
  Files         127      172      +45     
  Lines       10893    14149    +3256     
==========================================
+ Hits         5650     7255    +1605     
- Misses       4920     6503    +1583     
- Partials      323      391      +68     
Flag Coverage Δ
arm 42.89% <0.00%> (?)
macos-latest 44.01% <0.00%> (?)
ubuntu-24.04-arm 42.89% <0.00%> (?)
ubuntu-latest 49.72% <0.00%> (?)
x86 49.72% <0.00%> (?)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds PCLMULQDQ (CLMUL) CPU feature detection and uses it to fail fast on amd64 CPUs/VMs that would otherwise hit illegal SIMD instructions during native dispatch, preventing SIGILL crashes in Sonic’s SIMD JSON parsing path.

Changes:

  • Add HasPCLMULQDQ feature flag via cpuid.CLMUL in internal/cpu.
  • Guard internal/native amd64 dispatch with a PCLMULQDQ check and panic with an actionable message when missing.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File Description
internal/cpu/features.go Introduces HasPCLMULQDQ based on CPUID CLMUL support.
internal/native/dispatch_amd64.go Adds an init-time check to prevent dispatching into SIMD code paths that can SIGILL without PCLMULQDQ.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread internal/native/dispatch_amd64.go Outdated
Both AVX2 and SSE native code paths previously used PCLMULQDQ
(carry-less multiplication) for computing the in-quote bitmask in
the SIMD JSON string scanner. This caused SIGILL crashes on CPUs or
VMs that lack PCLMULQDQ, such as Hygon Dhyana under QEMU/KVM.

Changes:
- native/scanning.h: gate `_mm_clmulepi64_si128` behind `#ifdef
  USE_PCLMUL`; provide scalar prefix-XOR fallback for the SSE path
- scripts/build-x86.sh: SSE compiles with `-mno-pclmul`; AVX2
  compiles with `-mpclmul -DUSE_PCLMUL=1`
- Regenerate all SSE native code (no PCLMULQDQ instructions)
- internal/cpu/features.go: add `HasPCLMULQDQ` detection via
  `cpuid.CLMUL`
- internal/native/dispatch_amd64.go: AVX2 path now requires both
  `HasAVX2 && HasPCLMULQDQ`; SSE path works without PCLMULQDQ

Dispatch logic:
  HasAVX2 + HasPCLMULQDQ → AVX2 (with VPCLMULQDQ)
  HasSSE (no PCLMULQDQ)  → SSE  (scalar prefix-XOR fallback)

Fixes bytedance#867

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@liuq19 liuq19 force-pushed the fix/detect-pclmulqdq branch from 86bdeb2 to 54a512c Compare March 31, 2026 11:54
- SONIC_MODE=noavx/noavx2 now also sets HasPCLMULQDQ=false
- Add TestNoPCLMULQDQ_SubProcess: verifies CPU flags are correctly
  disabled under SONIC_MODE=noavx
- Add TestNoPCLMULQDQ_JSON: end-to-end test running JSON marshal,
  unmarshal, GetByPath, and validation through the SSE path (scalar
  prefix-XOR) in a subprocess with SONIC_MODE=noavx

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

SIGILL: illegal instruction

4 participants