HOTP asked to be resealed even if TOTP good (Picks up on a reinstalled OS even if firmware measurements haven't changed) #1935

Draft
wants to merge 15 commits into
base: master
from

Conversation

tlaurion
Collaborator

@tlaurion tlaurion commented Mar 17, 2025

Fixes #1562
Supersedes #1934 + reviewed changes


@marmarek/ @JonathonHall-Purism : good enough for merge?
EDIT: Thanks @JonathonHall-Purism for your review. All comments addressed.


Changes:

  • No more 3 attempts at TPMTOTP unsealing across multiple consoles to manage the race condition on simultaneous TPM usage (introduced with the Talos-2 port, which has dual console output on BMC and display consoles): now, if TPM unseal fails, we die
    • die() now asks the user to press Enter, which makes it clearer to the user what failed
  • Adds DEBUG + TRACE_FUNC calls and call stack traces (this is 280cb1f, which is PR "DEBUG+TRACE mode: provide a complete call stack trace on console / debug.log" #1934)
  • Bumps hotp-verification to 1.7 plus unreleased fixes (under 1.71/1.8) so that hotp_verification output spans multiple lines even when physical presence detection hiccups
  • Clarifies TPM counter creation + increment (I had a hard time understanding why things were not working; this is now easier to understand and debug in the future, especially when the board is in DEBUG+TRACE mode)
    • Added TRACE_FUNC (which now outputs the call hierarchy) along the way, so that in DEBUG+TRACE mode the call trace is easy to follow.
  • Indentation converted to tabs on all reviewed code.
  • When going to the recovery shell: guide the user on how to provide logs

@tlaurion
Collaborator Author

tlaurion commented Mar 18, 2025

@marmarek

This is ./docker_repro.sh make BOARD=qemu-coreboot-fbwhiptail-tpm2-hotp-prod_quiet USB_TOKEN=Nitrokey3NFC PUBKEY_ASC=pubkey.asc inject_gpg run with nk3 passed to testing qube.

Simulating OS reinstallation (wiping /boot/kexec*)

So same firmware, meaning:

  • Same firmware, with public key fused in rom and measured. So TPMTOTP good. Should warn that HOTP counter doesn't exist and prompt for HOTP reseal.
  • Let's wipe /boot/kexec* files:
    2025-03-17-200019
  • Reboot. And Then:
    2025-03-17-200115
    2025-03-17-200208
  • Then selecting the non-existing default boot should pick up and guide the user into signing and selecting a default:
    2025-03-17-200244
    2025-03-17-200319
    2025-03-17-200408
  • And then finally ask if TPM DUK must be created (optional):
    2025-03-17-195901

@JonathonHall-Purism

Other changes

  • Stubborn users who do not follow the on-screen instructions are still reminded that they haven't followed them. die() now requires the user to "Press any key":
    2025-03-17-201600
    2025-03-17-201614
    2025-03-17-201632
    2025-03-17-201655
    2025-03-17-201729
    2025-03-17-201755
  • rest as usual.

@tlaurion tlaurion marked this pull request as draft March 18, 2025 00:21
@tlaurion tlaurion changed the title WiP - HOTP asked to be resealed even if TOTP good (Picks up on a reinstalled OS even if firmware measurements haven't changed) HOTP asked to be resealed even if TOTP good (Picks up on a reinstalled OS even if firmware measurements haven't changed) Mar 18, 2025
@tlaurion tlaurion marked this pull request as ready for review March 18, 2025 00:21
@tlaurion
Collaborator Author

@JonathonHall-Purism in debug.

This is ./docker_repro.sh make BOARD=qemu-coreboot-whiptail-tpm2-hotp USB_TOKEN=Nitrokey3NFC PUBKEY_ASC=pubkey.asc inject_gpg run

To show 280cb1f

  • User reseals TPMTOTP/HOTP instead of resetting the TPM:
    2025-03-17-202616
    2025-03-17-202819

So now the user is locked into following these steps:

  • Reset TPM, which reseals TPM/HOTP + updates checksums and signs /boot
  • Be happy (user guided to do the right thing)

Full debug log of the combined steps (user doing Reset TPM as asked up to setting TPM DUK) with way more interesting call stack for everyone to understand:
full_debug_trace-from_tpm_reset-to_TPM_DUK.txt

Note:
For those who don't own an NK3/Librem Key: use the non-hotp variants. The qemu boards enforce canokey for GPG OpenPGP smartcard operations; follow targets/qemu.md, starting with ./docker_repro.sh make BOARD=qemu-coreboot-fbwhiptail-tpm2. You can learn Heads, or develop/contribute, by running the non prod_quiet versions, which output in TRACE+DEBUG mode.

@marmarek
Contributor

To be clear - previously after OS reinstall (including removal of /boot/kexec* files) the recommendation was to do full TPM reset, and now just resealing the HOTP secret should work, right?

@@ -29,7 +29,13 @@ mount_boot_or_die
#counter_value=$(read_tpm_counter $counter | cut -f2 -d ' ' | awk 'gsub("^000e","")')
#

counter_value=$(cat $HOTP_COUNTER)
#if HOTP_COUNTER is not present, bail out
if [ ! -f $HOTP_COUNTER ]; then
Collaborator Author

@tlaurion tlaurion Mar 18, 2025

@marmarek the issue was that previously, if TPMTOTP had already been sealed and could be unsealed (measured boot from coreboot+Heads), Heads did not check whether the HOTP counter under /boot/kexec_hotp_counter was still present.
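
A minimal sketch of the presence check described in this comment, assuming the counter is a plain file whose path is passed in. The `read_hotp_counter` helper is hypothetical and is not the function used in Heads:

```shell
#!/bin/bash
# Hypothetical helper (illustration only, not Heads code):
# print the HOTP counter value, or fail if the counter file is missing,
# e.g. because /boot/kexec* was wiped by an OS reinstall.
read_hotp_counter() {
	local counter_file="$1"
	if [ ! -f "$counter_file" ]; then
		echo "HOTP counter file not found: $counter_file" >&2
		return 1
	fi
	cat "$counter_file"
}
```

A caller can branch on the return status to offer an HOTP reseal instead of reading a file that is no longer there.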

@@ -280,7 +269,10 @@ update_hotp() {
HOTP='N/A'
fi

-if [[ "$CONFIG_TPM" = n && "$HOTP" = "Invalid code" ]]; then
+if [[ "$HOTP" = "Invalid code" ]]; then
Collaborator Author

@tlaurion tlaurion Mar 18, 2025

This check only verified whether HOTP was invalid when no TPM was in use.

So now, if there is no /boot/kexec_hotp_counter and TPMTOTP can unseal, the user is prompted to reseal HOTP alone (OS reinstall use case without firmware upgrade)
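
The decision described here can be sketched roughly as follows. This is an illustration under assumptions, not the gui-init logic: `next_action`, its arguments and the action names it prints are all hypothetical.

```shell
#!/bin/bash
# Hypothetical decision helper (illustration only, not Heads code).
# $1: "ok" if TPMTOTP unsealed successfully, anything else otherwise
# $2: path to the HOTP counter file
next_action() {
	local totp_state="$1" counter_file="$2"
	if [ "$totp_state" != "ok" ]; then
		# TPMTOTP cannot be unsealed: HOTP alone cannot be fixed
		echo "reseal-totp-or-reset-tpm"
	elif [ ! -f "$counter_file" ]; then
		# Firmware measurements are good but /boot state is gone
		echo "reseal-hotp"
	else
		echo "verify-hotp"
	fi
}
```

The point of the ordering is that a missing counter file only leads to an HOTP-only reseal when the TPMTOTP state is already known to be good.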

Collaborator Author

But what if we have kexec_rollback.txt? None of this makes sense: the user should reset the TPM here, or more logic needs to be refactored.

Collaborator Author

@tlaurion tlaurion left a comment

@marmarek added comments in code: with this PR, a missing HOTP counter will guide the user to reseal HOTP only and to generate the hashes and the detached signed digest files for /boot, as well as to select a default boot entry, and will propose setting a TPM DUK.

So a TPM reset is no longer needed after an OS re-installation for TPM1.
For TPM2, the TPM primary handle still needs to be created and hashed, which is why a TPM Reset is advised, both for creating the handle and for creating its hash, as the output also states.

This is a workaround for your issue.

@tlaurion tlaurion force-pushed the hotp_fixup_without_firmware_upgrade_boot_wiped branch from 6159012 to 75e766a Compare March 18, 2025 15:42
Collaborator

@JonathonHall-Purism JonathonHall-Purism left a comment

Thanks @tlaurion, strategy looks good to me and I left comments on some of the details 💯

@@ -789,12 +820,17 @@ increment_tpm_counter() {
TRACE_FUNC
tpmr counter_increment -ix "$1" -pwdc '' |
tee /tmp/counter-$1 >/dev/null 2>&1 ||
-die "TPM counter increment failed for rollback prevention. Please reset the TPM"
+die "TPM counter increment failed for rollback prevention. Please reset the TPM. Press Enter to continue"
+#TODO: why the die here needs to say to Press Enter to continue? Should be part of die call?
Collaborator

Maybe because oem-factory-reset has its own die()? It's not equivalent to this one though so you can't just delete it (kills some TOP_PID it captured rather than exiting) 🤔
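
To illustrate what the TODO is asking for, here is a rough sketch of a die() that owns the "Press Enter" prompt itself, so callers do not need to append it to every message. It is an assumption-laden example, not the die() from Heads nor the one in oem-factory-reset (which, as noted, kills a captured TOP_PID instead of exiting):

```shell
#!/bin/bash
# Illustration only: a die() that prints the error, waits for Enter,
# then exits non-zero. Message formatting is made up for this sketch.
die() {
	echo "!!! ERROR: $* !!!" >&2
	echo "Press Enter to continue" >&2
	# read -r consumes a whole line, so only Enter ends the wait
	# (unlike read -n 1, which would return on any key)
	read -r _
	exit 1
}
```

With the prompt inside die(), a call such as `die "TPM counter increment failed for rollback prevention. Please reset the TPM"` would not need the trailing "Press Enter to continue" text.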

@tlaurion tlaurion marked this pull request as draft March 25, 2025 16:28
@tlaurion tlaurion force-pushed the hotp_fixup_without_firmware_upgrade_boot_wiped branch 4 times, most recently from 5cfcc73 to 47c696f Compare March 25, 2025 22:25
@@ -719,7 +719,7 @@ tpm1_reset() {
DO_WITH_DEBUG tpm physicalsetdeactivated -c &>/dev/null
DO_WITH_DEBUG tpm forceclear &>/dev/null
DO_WITH_DEBUG tpm physicalenable &>/dev/null
-DO_WITH_DEBUG tpm takeown -pwdo "$tpm_owner_password" &>/dev/null
+DO_WITH_DEBUG --mask-position 3 tpm takeown -pwdo "$tpm_owner_password" &>/dev/null
Collaborator Author

DEBUG was exposing TPM owner passphrase on log + debug.log
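
As an illustration of the masking idea, the sketch below builds a loggable copy of a command line with one argument hidden. `masked_cmdline` is a hypothetical helper, not Heads' DO_WITH_DEBUG, and it assumes positions are 1-based and count the arguments after the command name, which is consistent with the password being the third argument in the diff above:

```shell
#!/bin/bash
# Hypothetical helper (illustration only, not DO_WITH_DEBUG):
# print a command line with the argument at 1-based position $1 masked.
# Positions count the arguments that follow the command name.
masked_cmdline() {
	local mask_pos="$1" cmd="$2"
	shift 2
	local out="$cmd" i=1 arg
	for arg in "$@"; do
		if [ "$i" -eq "$mask_pos" ]; then
			out="$out <masked>"
		else
			out="$out $arg"
		fi
		i=$((i + 1))
	done
	printf '%s\n' "$out"
}
```

For example, `masked_cmdline 3 tpm takeown -pwdo "$tpm_owner_password"` prints `tpm takeown -pwdo <masked>`, so the passphrase never reaches the log.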

@tlaurion tlaurion force-pushed the hotp_fixup_without_firmware_upgrade_boot_wiped branch 2 times, most recently from c38c13b to dae9e85 Compare March 25, 2025 22:33
@@ -5,9 +5,9 @@ find /boot/kexec*.txt | gpg --verify /boot/kexec.sig -
#remove invalid kexec_* signed files
mount /dev/sda1 /boot && mount -o remount,rw /boot && rm /boot/kexec* && mount -o remount,ro /boot
#Generate keys on OpenPGP smartcard:
-mount-usb && gpg --home=/.gnupg/ --card-edit
+mount-usb --mode rw && gpg --home=/.gnupg/ --card-edit
Collaborator Author

bash history now promotes mount-usb --mode rw. nitpick

…ives call hierarchy, fix HOTP resealing only on OS reinstall, clarify TPM increment workflow

Signed-off-by: Thierry Laurion <[email protected]>
@tlaurion tlaurion force-pushed the hotp_fixup_without_firmware_upgrade_boot_wiped branch from dae9e85 to 1f6a975 Compare March 26, 2025 03:12
@tlaurion tlaurion marked this pull request as ready for review March 26, 2025 03:17
@@ -183,17 +183,6 @@ update_totp() {
TOTP="NO TPM"
else
TOTP=$(unseal-totp)
# On platforms using CONFIG_BOOT_EXTRA_TTYS multiple processes may try to
Collaborator Author

No more 3 attempts on boot to unseal TPMTOTP. With multiple consoles (e.g. Talos-2 with display console + BMC), at worst we could introduce a small delay if the race condition still happens; die now asks the user to press Enter, guiding them to reseal TPMTOTP or reset the TPM if the TPM NVRAM cannot be accessed.

@tlaurion
Collaborator Author

tlaurion commented Mar 26, 2025

  • Using it on nv41, wiped /boot/kexec* : simulating OS reinstall: ok
  • Tested under qemu tpm2:
    • Wiped build///vtpm : reseal TPMTOTP/HOTP dies, telling the user a TPM reset is needed (TPM not owned) : ok
    • Wiped /boot/kexechotp : Reseal HOTP picks up when TPMTOTP is good: tests your reinstall case: ok

@marmarek
Contributor

marmarek commented Mar 27, 2025

Password prompt (order) is wrong - it says to enter GPG User PIN, but then asks for TPM Owner Password:
heads-pr-1935-wrong-prompt2
The GPG user PIN prompt comes after entering TPM owner password.

But then, after the prompt for updating default boot selection, it says to reset the TPM due to missing rollback counter. Wasn't it supposed to be not needed anymore?

@marmarek
Contributor

marmarek commented Mar 27, 2025

When I followed the instructions to do a TPM reset, it hung on a PIN prompt (it is also a bit confusing that it mentions admin PIN attempts when the prompt is about the user PIN):
heads-pr-1935-tpm-reset-hang
I waited for 5 minutes and nothing happened. LED on the Nitrokey is off.

@marmarek
Contributor

Oh, it actually did wait for something, I pressed Enter (without entering any PIN) and got this:
heads-pr-1935-tpm-reset-hang2

@marmarek
Contributor

At the prompt that hung before, if I try to enter the user PIN, I get the same error after entering the first character, so it seems it wasn't really waiting for the PIN...

@tlaurion tlaurion marked this pull request as draft March 27, 2025 13:17
@tlaurion
Collaborator Author

tlaurion commented Mar 27, 2025

Preparing test environment

cd heads
#copying pubkey.asc to be injected in rom for testing.
cp ~/Insurgo_2024_no_expiration_date.asc pubkey.asc
#hardlinking test disk image to root.qcow2 used by qemu board
sudo cp -alf ~/qemu-disks/debian-12-2_luks.qcow2 /home/user/heads/build/x86/qemu-coreboot-fbwhiptail-tpm2-hotp-prod_quiet/root.qcow2
#Creating disk snapshot to revert to
sudo qemu-img snapshot ~/qemu-disks/debian-12-2_luks.qcow2 -c "kexec-signed"
#preparing host to mount partitions from raw disk image
sudo modprobe nbd
sudo qemu-nbd --connect=/dev/nbd0 ./build/x86/qemu-coreboot-fbwhiptail-tpm2-hotp-prod_quiet/root.qcow2
#mounting boot partition from raw disk image
sudo mount /dev/nbd0p1 /media
#wiping Heads related /boot states signed under kexec.sig, including kexec_rollback.txt used for TPM counter rollback protection and TPM2 primary handle
sudo rm /media/kexec*
sudo umount /media
sudo qemu-nbd --disconnect /dev/nbd0
#creating snapshot to revert to so that /boot simulates a clean OS reinstall
sudo qemu-img snapshot ~/qemu-disks/debian-12-2_luks.qcow2 -c "kexec-wiped"

Used to iterate testing:

#wipe host vTPM (not TPM ownership: TPM blank)
sudo rm ./build/x86/qemu-coreboot-fbwhiptail-tpm2-hotp-prod_quiet/vtpm/*
#reset /boot to having no Heads related kexec*.txt files signed under kexec.sig/kexec_hotp_counter for HOTP
sudo qemu-img snapshot ~/qemu-disks/debian-12-2_luks.qcow2 -a "kexec-wiped"
#run qemu with NK3 passedthrough, with pubkey.asc injected in rom
./docker_repro.sh make BOARD=qemu-coreboot-fbwhiptail-tpm2-hotp-prod_quiet USB_TOKEN=Nitrokey3NFC PUBKEY_ASC=pubkey.asc inject_gpg run

Test case 1: TPM cleared, after OS install with Heads reownership done

sudo qemu-img snapshot ~/qemu-disks/debian-12-2_luks.qcow2 -a "kexec-wiped"
sudo rm ./build/x86/qemu-coreboot-fbwhiptail-tpm2-hotp-prod_quiet/vtpm/*
./docker_repro.sh make BOARD=qemu-coreboot-fbwhiptail-tpm2-hotp-prod_quiet USB_TOKEN=Nitrokey3NFC PUBKEY_ASC=pubkey.asc inject_gpg run

Power on:
2025-03-27-103210
Do not know. Try Generate new HOTP/TOTP secret on non-provisioned TPM:
2025-03-27-103504
Okok... i'll reset TPM as the ERROR message says I should do...:
2025-03-27-103631
Signs /boot stuff as well, order of prompts ok:
2025-03-27-103723
2025-03-27-103748
So we signed /boot, set TPMTOTP and sealed through reverse HOTP:
2025-03-27-103909
Heads detects that no default boot was set up, prompts
2025-03-27-104017
2025-03-27-104033
Onboards defining one + TPM DUK
2025-03-27-104209
2025-03-27-104224
2025-03-27-104914
Then prompts to sign changed /boot/kexec*.txt files...

TODO, fix

  • subshell calls messing up console output (GPG PIN warning vs die() prompts to press Enter key)
  • fix file descriptor leak (pipe) error
  • silence "Failed to setup async io, using sync io" prior of cryptsetup call (lvm code)

@tlaurion tlaurion assigned tlaurion and unassigned marmarek Mar 27, 2025
…hecksums. Warn user prior of effectively booting (shows console warning, wait 2s then reboot)

Signed-off-by: Thierry Laurion <[email protected]>
@tlaurion
Collaborator Author

tlaurion commented Mar 28, 2025

My nk3 had a non-default setting.

gpg --card-status

Signature PIN ....: not forced

Doing git rebase operations (and therefore signing commits individually) just doesn't make sense with USB Security dongle's protected private keys with Signature forced. Not sure if this toggle should be on actually, but out of scope of this PR.

Changing this for sake of testing.

gpg --card-edit
gpg/card> admin
Admin commands are allowed

gpg/card> help
quit           quit this menu
admin          show admin commands
help           show this help
list           list all available data
name           change card holder's name
url            change URL to retrieve key
fetch          fetch the key specified in the card URL
login          change the login name
lang           change the language preferences
salutation     change card holder's salutation
cafpr          change a CA fingerprint
forcesig       toggle the signature force PIN flag
generate       generate new keys
passwd         menu to change or unblock the PIN
verify         verify the PIN and list all data
unblock        unblock the PIN using a Reset Code
factory-reset  destroy all keys and data
kdf-setup      setup KDF for PIN authentication
key-attr       change the key attribute

gpg/card> forcesig

Seems like we come to a time where caching GPG User PIN is desired.

… prompt for recovery shell access, state where debug logs are in centralized way

Note for linuxboot#1888:
warn in code is used mostly to actually warn user of something requiring his attention, and pausing for 2 seconds.

Goal is:
die: blocking: tell user that something failed, requiring acknowledgement for corrective actions.
warn: display "WARNING:" prepended messages which pauses for 2 seconds prior of continuing. This is not an error, nor INFO
INFO: gives a trace to the user when in QUIET mode, under /tmp/debug.log related to core components output, typically related to measurements traces.

Consequently, moving what is currently under warn to INFO would silence it on the console. We want to get rid of manual "echo +++++" messages.
So it seems we lack what is currently named INFO to go into measurement_log, and INFO (green), warn (yellow) and die (red) messages to console.

Signed-off-by: Thierry Laurion <[email protected]>
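
The two non-blocking levels from the commit message above can be sketched like this. It is a simplified illustration, not the Heads functions: the lowercase names, the message formats and the `WARN_PAUSE` knob are assumptions; only the 2 second pause, the "WARNING:" prefix and /tmp/debug.log come from the text.

```shell
#!/bin/bash
# Illustration only. DEBUG_LOG defaults to the path named in the commit
# message; WARN_PAUSE is a hypothetical knob so the pause can be shortened.
DEBUG_LOG="${DEBUG_LOG:-/tmp/debug.log}"
WARN_PAUSE="${WARN_PAUSE:-2}"

# info: not shown on the console in quiet mode, only traced to the log
info() {
	echo "INFO: $*" >>"$DEBUG_LOG"
}

# warn: not an error; shown on the console, then pause so it can be read
warn() {
	echo "WARNING: $*" >&2
	sleep "$WARN_PAUSE"
}
```

die stays the only blocking level: it reports a failure and waits for the user to acknowledge it before any corrective action.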
Signed-off-by: Thierry Laurion <[email protected]>
… being set: observed in fbwhiptail-tpm2-hotp-prod_quiet

  991 root      3272 S    {gui-init} /bin/bash /bin/gui-init
 2024 root      2792 S    {kexec-select-bo} /bin/bash /bin/kexec-select-boot -
 2025 root      1364 S    sha256sum -c /tmp/kexec/kexec_default_hashes.txt
 2105 root      2068 S    /bin/bash

Signed-off-by: Thierry Laurion <[email protected]>
…. Logs for first under usb.raw to check against HOTP reseal

Signed-off-by: Thierry Laurion <[email protected]>
@JonathonHall-Purism
Collaborator

@tlaurion Looked over the recent changes, could you cherry pick this tweak to finish addressing my nitpicks please 🙂 ad807f4

I see you have a few TODOs left to address, it looks OK to me otherwise, let me know when you would like further review of those fixes

tlaurion and others added 8 commits April 28, 2025 14:09
…hrase equiv) + easthetic fixes

Signed-off-by: Thierry Laurion <[email protected]>
…p_without_firmware_upgrade_boot_wiped-staging

Signed-off-by: Thierry Laurion <[email protected]>
Asking to press Enter is more forgiving than "any key" and good, but we
also have to actually continue on Enter instead of any key.

Signed-off-by: Jonathon Hall <[email protected]>
… sleep one second before continuing

Signed-off-by: Thierry Laurion <[email protected]>
…l selected containers prior of prompting for new DUK

Signed-off-by: Thierry Laurion <[email protected]>
…nderstanding and debugging

Signed-off-by: Thierry Laurion <[email protected]>
…d leaves 1 second to the user to read the notice

Signed-off-by: Thierry Laurion <[email protected]>
Successfully merging this pull request may close these issues.

Unable to create rollback file after OS reinstall (Regenerate TOTP/HOTP)
3 participants