
fix: ipmi rmcptag and cbcpad #7555

Merged
Obihoernchen merged 4 commits into xcat2:master from VersatusHPC:fix/ipmi-rmcptag-and-cbcpad on May 6, 2026


@viniciusferrao
Member

@Obihoernchen this one was only possible with the help of AI-assisted tools, so the explanation is also up to them:

Problem

xCAT's IPMI implementation can't talk to OpenBMC-based BMCs. All commands (rpower, rsetboot, rinv, rvitals) either time out or crash with the splice error from #7511. Meanwhile, ipmitool works fine on the same hardware with the same credentials.

Reported on Lenovo XCC3 (#7511). We reproduced it on an IBM POWER9 AC922 (also OpenBMC-based). @Obihoernchen has seen the same class of failure on Intel/Mitac BMCs.

What we found

We instrumented IPMI.pm with syslog traces and did packet captures against a POWER9 BMC to compare xCAT's RMCP+ handshake with ipmitool's. Three things were wrong:

1. OpenBMC returns message tag 0 in RAKP2

The IPMI spec says RAKP Message 2 should echo the tag from RAKP Message 1. OpenBMC returns 0 instead. xCAT rejected every RAKP2 as stale and retried until timeout:

IPMI_RAKP2: rmcptag check: got=0 expected=137
IPMI_RAKP2: rmcptag check: got=0 expected=139
IPMI_RAKP2: rmcptag check: got=0 expected=141
... (128 times, then timeout)

The same check exists in got_rmcp_response and got_rakp4.
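
A minimal sketch of the relaxed check this PR describes, written in Python for illustration only (xCAT's real code is Perl in IPMI.pm). The function name and parameters are hypothetical; only the rule itself (accept tag 0 when the remote console session ID matches our sidm) comes from the description here.

```python
def accept_response(got_tag: int, expected_tag: int,
                    got_console_sid: bytes, our_sidm: bytes) -> bool:
    """Decide whether a handshake response should be processed.

    Hypothetical helper: names are illustrative, not taken from IPMI.pm.
    """
    # Spec-compliant BMCs echo the tag we sent, so accept as before.
    if got_tag == expected_tag:
        return True
    # Some BMCs return tag 0; accept only if the response carries
    # our current remote console session ID, so stale replies stay rejected.
    if got_tag == 0 and got_console_sid == our_sidm:
        return True
    return False
```

With this shape, a reply with tag 0 from an older session attempt (different session ID) is still dropped, which is the stale-response protection the strict tag comparison provided.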

2. xCAT doesn't set the name-only lookup bit in RAKP1

After fixing the tag, sessions got through but failed with "Unauthorized name". Packet capture showed the difference: ipmitool sends the RAKP1 privilege byte as 0x14 (admin + name-only lookup), while xCAT sent 0x04.

The name-only lookup bit (bit 4, IPMI spec Table 13-17) tells the BMC to find the user by name alone rather than by name+privilege. Without it, some BMCs reject valid credentials.

This byte is also part of the HMAC input for RAKP2 verification, RAKP3 auth code, and SIK derivation, so all three had to use the updated value to keep the hashes correct.
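
To illustrate why the byte has to be consistent everywhere, here is a small Python sketch (not xCAT's Perl). It assumes HMAC-SHA1 and a SIK input laid out as Rm, Rc, role byte, username length, username, which is our reading of the IPMI v2.0 spec; the function names and sample values are made up for the example.

```python
import hashlib
import hmac

PRIV_ADMIN = 0x04        # administrator privilege level
NAME_ONLY_LOOKUP = 0x10  # bit 4 of the requested privilege byte


def rakp1_role_byte(priv: int = PRIV_ADMIN, name_only: bool = True) -> int:
    """Build the requested-privilege byte sent in RAKP Message 1."""
    return priv | (NAME_ONLY_LOOKUP if name_only else 0)


def derive_sik(key: bytes, rm: bytes, rc: bytes, role: int, username: bytes) -> bytes:
    """Illustrative SIK derivation; the input layout is an assumption."""
    msg = rm + rc + bytes([role, len(username)]) + username
    return hmac.new(key, msg, hashlib.sha1).digest()


role = rakp1_role_byte()  # 0x04 | 0x10 == 0x14
rm, rc = bytes(16), bytes(range(16))  # placeholder random numbers
sik_sent = derive_sik(b"secret", rm, rc, role, b"admin")
sik_stale = derive_sik(b"secret", rm, rc, PRIV_ADMIN, b"admin")
# Hashing with 0x04 after sending 0x14 yields a different key,
# so the two sides would not agree.
mismatch = sik_sent != sik_stale
```

The same reasoning applies to the RAKP2 verification and the RAKP3 auth code: any hash that includes the role byte must use the value that was actually sent on the wire.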

3. cbc_pad crashes on bad padding (the original #7511)

cbc_pad in decrypt mode reads the last byte as a pad count and does splice(@block, 0 - $count). If decryption produced garbage, the count could exceed the array length, crashing with "Modification of non-creatable array value attempted, subscript -16".
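
The guard can be sketched as follows in Python (an illustration, not the Perl from IPMI.pm). It assumes the trailing count byte is removed along with the pad bytes, and it only performs the bounds check; the function name is hypothetical.

```python
def strip_pad(block: bytes) -> bytes:
    """Remove trailing padding, or return b"" if the pad count is impossible.

    Assumed layout: payload, then `count` pad bytes, then the count byte.
    """
    if not block:
        return b""
    count = block[-1]
    # The pad bytes plus the count byte itself must fit inside the block.
    # Garbage from a failed decryption can easily violate this.
    if count + 1 > len(block):
        return b""
    return block[: len(block) - count - 1]
```

For example, `strip_pad(bytes([0xAA, 0xBB, 1, 2, 2]))` yields the two payload bytes, while a 16-byte block whose last byte is 16 yields an empty result instead of raising an error.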

What this PR does

Four commits:

  1. Return empty string from cbc_pad when padding is invalid, so corrupted decryption is treated as a packet failure instead of crashing.

  2. Accept message tag 0 in got_rmcp_response, got_rakp2, and got_rakp4, but only after verifying the remote console session ID matches our current sidm. This keeps the stale-response protection while letting OpenBMC responses through.

  3. Set the name-only lookup bit in RAKP1 and use the same privilege byte in all HMAC calculations (RAKP2 check, RAKP3 auth code, SIK). Matches what ipmitool does.

  4. Extend the sha256-to-sha1 fallback (already in got_rmcp_response) to also cover RAKP2 auth rejections.
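
The fallback decision in the last commit can be pictured with this small Python sketch (illustrative only, not xCAT's Perl). The status codes are the two named in the commit message; the function name and the string labels for the algorithms are hypothetical.

```python
RAKP2_INVALID_ROLE = 0x09
RAKP2_UNAUTHORIZED_NAME = 0x0D


def should_retry_with_sha1(current_alg: str, rakp2_status: int) -> bool:
    """Return True when a RAKP2 rejection should trigger a sha1 retry."""
    # Only fall back once: a session already on sha1 has nowhere to go.
    if current_alg != "sha256":
        return False
    return rakp2_status in (RAKP2_INVALID_ROLE, RAKP2_UNAUTHORIZED_NAME)
```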

Tested on

| BMC | rpower | rsetboot | rinv | rvitals |
| --- | --- | --- | --- | --- |
| IBM POWER9 AC922 (OpenBMC) | on | Hard Drive | SDR unsupported* | - |
| Lenovo XCC2 | on | boot override inactive | BMC Firmware: 2.81 | Ambient: 26 C |
| Supermicro X10DRW | on | boot override inactive | BMC Firmware: 3.89 | CPU1: 23 C |
| Dell iDRAC | on | boot override inactive | BMC Firmware: 7.20 | Exhaust: 44 C |

* SDR reservation is an OpenBMC limitation, unrelated to this fix.

Before this PR, the POWER9 BMC timed out on every xCAT IPMI command. After, all four BMC types work. We don't have a Lenovo XCC3 or Intel/Mitac to test, so those need the reporter @nuttapongc or @Obihoernchen to verify.

Fixes #7511

cbc_pad in decrypt mode reads the last byte as the pad count, then
calls splice(@block, 0 - $count). If decrypted data is corrupt, the
pad count can exceed the array size, crashing with "Modification of
non-creatable array value attempted, subscript -16".

Return empty string on invalid padding so the caller treats it as a
decryption failure rather than accepting corrupted data as a valid
IPMI response.

Ref: xcat2#7511

OpenBMC-based BMCs return message tag 0 in RAKP2/RAKP4 instead of
echoing the tag from the request. xCAT rejected these as stale
responses and retried until timeout.

Accept tag 0 but verify the remote console session ID in the response
matches our current sidm. This prevents stale retries from corrupting
session state while allowing OpenBMC responses through.

Applied to got_rmcp_response, got_rakp2, and got_rakp4.

Ref: xcat2#7511

Set bit 4 (0x10) of the requested privilege byte in RAKP Message 1
for name-only user lookup, matching ipmitool behavior. Use the same
value consistently in all HMAC calculations (RAKP2 verification,
RAKP3 auth code, SIK derivation).

Without this, some BMCs fail user lookup with "Unauthorized name"
even though the credentials are correct.

Ref: xcat2#7511

Extend the existing sha256-to-sha1 fallback (already present in
got_rmcp_response for Open Session errors) to also cover RAKP2
rejections with "Unauthorized name" (0x0d) or "Invalid role" (0x09).

Ref: xcat2#7511
Member

@Obihoernchen Obihoernchen left a comment


Tested on Intel/Mitac as well. Works fine.

@Obihoernchen Obihoernchen added this to the 2.18 milestone May 6, 2026
@Obihoernchen Obihoernchen merged commit a90ef27 into xcat2:master May 6, 2026
2 checks passed
@viniciusferrao viniciusferrao deleted the fix/ipmi-rmcptag-and-cbcpad branch May 7, 2026 03:20


Development

Successfully merging this pull request may close these issues.

rsetboot error with Modification of non-creatable array value attempted, subscript -16 at /opt/xcat/lib/perl/xCAT/IPMI.pm line 766.

2 participants