@bigspider
Contributor

Description

Restores the functionality of HAVE_BOLOS_APP_STACK_CANARY.

Initializes the canary on os_io_init, and checks it on receive/sends.

This restores a behaviour similar to how it used to work in the old io_exchange: all apps call os_io_init in the preamble (the legacy code used io_seproxyhal_init instead), and we check the canary every time the io send/receive functions are called.

Since os_io_init, os_io_rx_evt and os_io_tx_cmd in os_io.c are only used if USE_OS_IO_STACK is not defined, the same initialization and checks are also implemented for the analogous weak functions defined in syscalls.c.
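
As a rough illustration of the flow described above, here is a host-side sketch that arms the canary in an init function and checks it on every send. This is not the SDK code: `sketch_io_init`, `sketch_io_tx_cmd`, `mock_os_sched_exit` and `last_exit_code` are hypothetical stand-ins, the canary is a plain static variable rather than the SDK's own symbol, and the exit value 42 is the one mentioned later in this conversation.

```c
#include <stdint.h>

#define HAVE_BOLOS_APP_STACK_CANARY
#define APP_STACK_CANARY_MAGIC 0xDEAD0031
#define APP_STACK_CANARY_CORRUPTED_EXIT_VALUE 42

/* Assumption: on a device the canary sits at a location chosen by the SDK.
 * Here it is an ordinary variable so the sketch can run on a host. */
static uint32_t app_stack_canary;

/* Hypothetical stand-in for os_sched_exit: records the exit code
 * instead of terminating the app. */
static int last_exit_code = -1;

static void mock_os_sched_exit(int code)
{
    last_exit_code = code;
}

/* Sketch of the init path: arm the canary once, before any app code runs. */
static void sketch_io_init(void)
{
#ifdef HAVE_BOLOS_APP_STACK_CANARY
    app_stack_canary = APP_STACK_CANARY_MAGIC;
#endif
}

/* Sketch of the send path: verify the canary before doing any IO.
 * Returns the number of bytes "sent", or -1 if the canary was corrupted. */
static int sketch_io_tx_cmd(const uint8_t *buffer, uint16_t length)
{
#ifdef HAVE_BOLOS_APP_STACK_CANARY
    if (app_stack_canary != APP_STACK_CANARY_MAGIC) {
        mock_os_sched_exit(APP_STACK_CANARY_CORRUPTED_EXIT_VALUE);
        return -1;
    }
#endif
    (void) buffer; /* no real transport in this sketch */
    return (int) length;
}
```

A receive function would carry the same check. Because both paths run on every exchange, a corrupted canary is noticed at the next IO call rather than at the moment of the overflow.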

Changes include

  • Bugfix (non-breaking change that solves an issue)
  • New feature (non-breaking change that adds functionality)
  • Breaking change (change that is not backwards-compatible and/or changes current functionality)
  • Tests
  • Documentation
  • Other (for changes that might not fit in any category)

Auto cherry-pick in API_LEVEL

[x] TARGET_API_LEVEL: API_LEVEL_24
[x] TARGET_API_LEVEL: API_LEVEL_25

@codecov-commenter

codecov-commenter commented Oct 13, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 62.41%. Comparing base (b68cbcf) to head (a4c52f6).

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #1273   +/-   ##
=======================================
  Coverage   62.41%   62.41%           
=======================================
  Files          14       14           
  Lines        1868     1868           
=======================================
  Hits         1166     1166           
  Misses        702      702           
Flag Coverage Δ
unittests 62.41% <ø> (ø)


@bigspider bigspider changed the title Stack canary Restore the stack canary Oct 13, 2025
@bigspider bigspider marked this pull request as ready for review October 14, 2025 08:38
@bigspider
Contributor Author

I verified on a locally built clone of the boilerplate:

  • the app crashes before trying to send the response if the canary is overwritten and DEFINES += HAVE_BOLOS_APP_STACK_CANARY is added to the Makefile
  • the app doesn't crash if the canary is overwritten without HAVE_BOLOS_APP_STACK_CANARY.
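
The two cases above can be reproduced in miniature on a host. The sketch below models memory as a flat byte array with the canary word at the low end and a stack that grows downward towards it; that layout is an assumption made for illustration, and `sim_mem`, `sim_init`, `sim_push` and `sim_canary_tripped` are hypothetical names, not SDK functions.

```c
#include <stdint.h>
#include <string.h>

#define HAVE_BOLOS_APP_STACK_CANARY /* remove this line to compile the check out */
#define APP_STACK_CANARY_MAGIC 0xDEAD0031u

#define SIM_STACK_SIZE 32u

/* Assumed layout: [ canary word | stack area ], stack growing downward. */
static uint8_t sim_mem[sizeof(uint32_t) + SIM_STACK_SIZE];

/* Clear the simulated memory and arm the canary at the low end. */
static void sim_init(void)
{
    const uint32_t magic = APP_STACK_CANARY_MAGIC;
    memset(sim_mem, 0, sizeof(sim_mem));
    memcpy(sim_mem, &magic, sizeof(magic));
}

/* Simulate stack usage: fill `len` bytes ending at the top of memory.
 * Anything beyond SIM_STACK_SIZE spills into the canary word. */
static void sim_push(uint8_t fill, size_t len)
{
    if (len > sizeof(sim_mem)) {
        len = sizeof(sim_mem);
    }
    memset(sim_mem + sizeof(sim_mem) - len, fill, len);
}

/* Returns 1 if the check would stop the app, 0 otherwise.
 * Without HAVE_BOLOS_APP_STACK_CANARY the check is compiled out. */
static int sim_canary_tripped(void)
{
#ifdef HAVE_BOLOS_APP_STACK_CANARY
    uint32_t value;
    memcpy(&value, sim_mem, sizeof(value));
    return value != APP_STACK_CANARY_MAGIC;
#else
    return 0;
#endif
}
```

Filling exactly `SIM_STACK_SIZE` bytes leaves the canary intact, while a single extra byte changes it. With the define removed, `sim_canary_tripped` always returns 0, which mirrors the second bullet.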

This will allow it to be accessible from several source files.
@bigspider
Contributor Author

bigspider commented Oct 14, 2025

In a4c52f6, I replaced os_reset with os_sched_exit, which lets the app exit cleanly with a recognizable exit value (at least on speculos; on a real device it exits to the dashboard with no response). The constant 42 was chosen for obvious reasons.

Comment on lines -51 to -53
#ifdef HAVE_BOLOS_APP_STACK_CANARY
#define APP_STACK_CANARY_MAGIC 0xDEAD0031
#endif // HAVE_BOLOS_APP_STACK_CANARY
Contributor

We'll need to check whether this causes issues for applications that still use the legacy IO together with the canary.

Contributor Author

Nothing should change for them. These defines are moved to os_io.h, which is included by os_io_legacy.
The only difference is that this define is now visible in anything that includes os_io.h, so it could only conflict if they have a different define with the same name.

Comment on lines +1900 to +1905
#ifdef HAVE_BOLOS_APP_STACK_CANARY
    // if the canary is corrupted, reset the device
    if (app_stack_canary != APP_STACK_CANARY_MAGIC) {
        os_sched_exit(APP_STACK_CANARY_CORRUPTED_EXIT_VALUE);
    }
#endif
Contributor

Maybe we can factor out this snippet.

Contributor Author

@bigspider bigspider Oct 14, 2025

I was worried about introducing an additional function because if we know the stack is already corrupted, we should use it as little as possible.
Perhaps we can use a macro.
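
One possible shape for such a macro is sketched below. It expands the check inline at each call site, so the check itself adds no stack frame. `CHECK_APP_STACK_CANARY`, `os_sched_exit_stub`, `exit_calls`, `rx_sketch` and `tx_sketch` are hypothetical names used only for this illustration; the stub returns to its caller, whereas the real os_sched_exit is expected to leave the app.

```c
#include <stdint.h>

#define HAVE_BOLOS_APP_STACK_CANARY
#define APP_STACK_CANARY_MAGIC 0xDEAD0031
#define APP_STACK_CANARY_CORRUPTED_EXIT_VALUE 42

/* Host-side stand-ins (assumptions, not the SDK definitions). */
static uint32_t app_stack_canary = APP_STACK_CANARY_MAGIC;
static int      exit_calls;

static void os_sched_exit_stub(int code)
{
    (void) code;
    exit_calls++; /* count calls instead of exiting */
}

#ifdef HAVE_BOLOS_APP_STACK_CANARY
/* Expands inline: the comparison itself needs no extra function call. */
#define CHECK_APP_STACK_CANARY()                                           \
    do {                                                                   \
        if (app_stack_canary != APP_STACK_CANARY_MAGIC) {                  \
            os_sched_exit_stub(APP_STACK_CANARY_CORRUPTED_EXIT_VALUE);     \
        }                                                                  \
    } while (0)
#else
/* Without the define, the macro compiles to nothing. */
#define CHECK_APP_STACK_CANARY() \
    do {                         \
    } while (0)
#endif

/* Two hypothetical IO entry points sharing the same check. */
static int rx_sketch(void)
{
    CHECK_APP_STACK_CANARY();
    return 0;
}

static int tx_sketch(void)
{
    CHECK_APP_STACK_CANARY();
    return 0;
}
```

The `do { ... } while (0)` wrapper makes the macro behave like a single statement, so it stays safe inside an unbraced `if`/`else`.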

#ifdef HAVE_BOLOS_APP_STACK_CANARY
    // if the canary is corrupted, reset the device
    if (app_stack_canary != APP_STACK_CANARY_MAGIC) {
        os_sched_exit(APP_STACK_CANARY_CORRUPTED_EXIT_VALUE);
Contributor

In a perfect world, this would be tested automatically against an app with and without the legacy IO stack.
But we do not seem to have the infrastructure for this.
https://github.com/LedgerHQ/speculos/tree/master/tests/c might be a candidate, but it does not seem to fit either.

#ifdef HAVE_BOLOS_APP_STACK_CANARY
app_stack_canary = APP_STACK_CANARY_MAGIC;
#endif // HAVE_BOLOS_APP_STACK_CANARY

Contributor

I would prefer to not modify the syscall APIs.
Why not init in an early app start entrypoint?

Contributor Author

The syscall APIs are the only place used for IO by most apps (that is, after the IO revamp, with USE_OS_IO_STACK defined). So even if you find a different place for initialization, you still need to modify syscalls.c to check for canary corruption in os_io_rx_evt/os_io_tx_cmd.

(I chose os_io_init because I checked that both the C and the Rust SDK app preambles call it before any real application code.)

Contributor

Syscalls are not called directly by apps; there are SDK functions that can be used instead. Let's discuss to find a better solution.

Contributor

@bboilot-ledger bboilot-ledger left a comment

LGTM from a security standpoint. I agree with the other comments, especially about modifying the syscall APIs. Unfortunately, I also do not see where else we could put those snippets so that they are used by apps by default, without requiring any extra code in the app's codebase.
Is there any impact on apps built with zxlib/sdk?

7 participants