@ibvqeibob
Contributor

Adds a trap_reason field to Sail traps and exposes it to the C/C++ callback interface so that the emulator can log more detailed reasons for traps.

Main changes:

  • Introduce a debug-only trap_reason union and add reason : trap_reason to sync_exception.
  • Add handle_exception_with_reason(xtval, e, reason) and keep the existing handle_exception as a wrapper using Trap_reason_none().
  • Extend trap_handler, exception_handler and trap_callback to take and propagate a trap_reason.
  • Expose struct ztrap_reason from sail_riscv_model.h as typedef struct ztrap_reason trap_reason and update the C/C++ trap_callback(bool, fbits, trap_reason) interface.
  • For Step_Fetch_Failure and Step_Execute(Memory_Exception(...)), pass Access_not_in_physical_memory() as an initial example reason.
  • log_callbacks::trap_callback prints both the trap cause and reason for debugging.
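
As a rough illustration of the last bullet, here is a minimal C++ sketch of a logging callback that prints both the architectural cause and the debug-only reason. The `TrapReasonKind` enum, the `TrapReason` struct and the `format_trap` helper are hypothetical stand-ins: the real `struct ztrap_reason` is generated from the Sail model and its layout is not shown in this description, and the exact log format here is an assumption.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical stand-in for the generated `struct ztrap_reason`.
// The real type is produced by the Sail-to-C compiler.
enum class TrapReasonKind { None, AccessNotInPhysicalMemory };

struct TrapReason {
  TrapReasonKind kind;
};

// Map a reason to the lowercase tag used in the log line.
inline std::string reason_name(const TrapReason &r) {
  switch (r.kind) {
  case TrapReasonKind::None:
    return "none";
  case TrapReasonKind::AccessNotInPhysicalMemory:
    return "access_not_in_physical_memory";
  }
  return "unknown";
}

// Build one log line with the trap cause and the additional reason.
// (Assumed format; the actual log_callbacks output may differ.)
inline std::string format_trap(bool is_interrupt, std::uint64_t cause,
                               const TrapReason &r) {
  std::ostringstream os;
  os << "trap: is_interrupt=" << (is_interrupt ? 1 : 0)
     << " cause=" << cause
     << " reason: " << reason_name(r);
  return os.str();
}
```

A callback with the `(bool, fbits, trap_reason)` shape would forward its three arguments to a helper like `format_trap` and write the result to the log.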

Testing:

  • Rebuilt sail_riscv_sim successfully.
  • Ran a small ELF that triggers an instruction fetch access fault and observed:
    trap: is_interrupt=0 cause=1 and reason: access_not_in_physical_memory in the log.

let t : sync_exception = struct { trap = e,
excinfo = xtval_exception_value(e, xtval),
ext = None() };
function handle_exception_with_reason(xtval : xlenbits, e : ExceptionType, r : trap_reason) -> unit = {
Collaborator

Looks good overall, but I personally don't think that we should rename the function. handle_exception() was good.

Contributor Author

Thanks for the review!
I've kept handle_exception(xtval, e) as the main interface and added handle_exception_with_reason(xtval, e, r) as a helper. The old handle_exception now just forwards to the helper with Trap_reason_none(), so existing call sites can continue to use the original name.

Collaborator

Oh, I missed that. But why not just modify the handle_exception() function signature and pass the reason instead? Unless I’m missing something, that seems simpler. As of today, there are three call sites for handle_exception().

Contributor Author

Thanks, that makes sense.
I've now removed handle_exception_with_reason and changed handle_exception to take (xtval, e, reason : trap_reason) directly. All existing call sites have been updated to pass an explicit trap_reason.
For the memory-related traps (Step_Fetch_Failure and Memory_Exception) I pass Access_not_in_physical_memory(), and for other traps I currently pass Trap_reason_none() until we add more specific reasons for them.

Step_Execute(Trap(priv, ctl, pc), _) => set_next_pc(exception_handler(priv, ctl, pc)),
Step_Execute(Memory_Exception(vaddr, e), _) => handle_exception(bits_of(vaddr), e),
Step_Execute(Memory_Exception(vaddr, e), _) => handle_exception_with_reason(bits_of(vaddr), e, Access_not_in_physical_memory()),
Step_Execute(Illegal_Instruction(), instbits) => handle_exception(zero_extend(instbits), E_Illegal_Instr()),
Collaborator

Did you forget to pass a reason for Illegal_Instruction()?

Contributor Author

Good catch — I missed that case initially.
Step_Execute(Illegal_Instruction(), instbits) now calls: handle_exception(zero_extend(instbits), E_Illegal_Instr(), Trap_reason_none())

Comment on lines 297 to 306
function handle_exception(xtval : xlenbits, e : ExceptionType) -> unit = {
handle_exception_with_reason(xtval, e, Trap_reason_none())
}

Collaborator

What does Trap_reason_none() represent? A trap should always have a reason, or is this just a temporary placeholder and reasons are added gradually over time?

Unless I am missing something, I would remove it entirely (Trap_reason_none()) or at least wrap it in an option type like option(trap_reason).

Contributor Author

Trap_reason_none() is intended to mean "there is no additional debug information beyond the architectural ExceptionType".
Right now only a subset of traps populate a more specific trap_reason (for example the memory-related fetch/data access faults that motivated this change). For the rest, we still want to plumb a trap_reason value through the interfaces so that the C callbacks have a uniform signature, but we don't yet have anything more precise to attach, hence the Trap_reason_none() constructor.
I did consider modelling this as option(trap_reason) as you suggest, but using an explicit Trap_reason_none keeps the generated C interface a bit simpler.
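
To make the trade-off concrete, here is a rough C++ analogy of the two encodings. This is only a sketch: `ReasonA`, `ReasonB`, `describe_a` and `describe_b` are hypothetical names, and this is not the C code that Sail generates for either design.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Encoding A: the "no reason" case is a constructor of the type itself,
// analogous to an explicit Trap_reason_none().
enum class ReasonA { None, AccessNotInPhysicalMemory };

// Encoding B: the type lists only real reasons; absence is std::nullopt,
// analogous to option(trap_reason).
enum class ReasonB { AccessNotInPhysicalMemory };

inline std::string describe_a(ReasonA r) {
  switch (r) {
  case ReasonA::None:
    return "none";
  case ReasonA::AccessNotInPhysicalMemory:
    return "access_not_in_physical_memory";
  }
  return "unknown";
}

inline std::string describe_b(const std::optional<ReasonB> &r) {
  if (!r.has_value()) {
    return "none";  // the wrapper, not the reason type, encodes absence
  }
  switch (*r) {
  case ReasonB::AccessNotInPhysicalMemory:
    return "access_not_in_physical_memory";
  }
  return "unknown";
}
```

With encoding A the callback receives a single flat type; with encoding B every consumer has to unwrap the optional first, but the reason type itself never contains a placeholder value.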

Collaborator

But isn't the whole point of this PR to get more clarity on why we trapped? If most call sites just pass Trap_reason_none(), we're not really getting that clarity, we just have extra work.

It’s just my opinion, but I think we should not use Trap_reason_none(). It would be better to create meaningful messages, for example in the make_landing_pad_exception case. You could simply add a new reason like ‘landing pad exception’ or something like that.

Contributor Author

You're absolutely right. I've reworked the patch along those lines:

  • Introduced Trap_reason_unclassified as an explicit "no extra information (yet)" value, and restricted its use to cases where we genuinely don't have anything more to say (e.g. interrupts, some generic/legacy paths).
  • For fetch and generic memory exceptions, access faults (E_Fetch_Access_Fault, E_Load_Access_Fault, E_SAMO_Access_Fault) now use Access_not_in_physical_memory() instead of an uninformative default.
  • make_landing_pad_exception now uses a dedicated Landing_pad_exception() trap reason, so landing-pad-related software checks are distinguishable on the C side.

@github-actions
github-actions bot commented Dec 8, 2025

Test Results

2 115 tests  ±0   2 115 ✅ ±0   20m 28s ⏱️ +9s
    1 suites ±0       0 💤 ±0 
    1 files   ±0       0 ❌ ±0 

Results for commit cbab7d3. ± Comparison against base commit 7cf10dd.

♻️ This comment has been updated with latest results.

@nadime15 nadime15 requested a review from Timmmm December 9, 2025 16:22
Comment on lines 53 to 58
union trap_reason = {
Trap_reason_unclassified : unit,
Access_not_in_physical_memory : unit,
Invalid_pte_reserved_bits_nonzero : (physaddrbits, xlenbits),
Landing_pad_exception : unit
}
Collaborator

Suggested change
union trap_reason = {
Trap_reason_unclassified : unit,
Access_not_in_physical_memory : unit,
Invalid_pte_reserved_bits_nonzero : (physaddrbits, xlenbits),
Landing_pad_exception : unit
}
union TrapReason = {
Trap_Reason_Unclassified : unit,
Access_Not_In_Physical_Memory : unit,
Invalid_Pte_Reserved_Bits_Nonzero : (physaddrbits, xlenbits),
Landing_Pad_Exception : unit
}

It seems that the majority of the codebase uses snake_case with capital letters for unions.

union trap_reason = {
Trap_reason_unclassified : unit,
Access_not_in_physical_memory : unit,
Invalid_pte_reserved_bits_nonzero : (physaddrbits, xlenbits),
Collaborator

It looks like Invalid_pte_reserved_bits_nonzero is not used?

Comment on lines 160 to 174
function memory_exception_reason(e : ExceptionType) -> trap_reason =
match e {
// These ExceptionType values indicate that the memory access
// failed at the level of the implemented physical memory
// (e.g. no backing memory / PMA issues / bus errors).
E_Fetch_Access_Fault() => Access_not_in_physical_memory(),
E_Load_Access_Fault() => Access_not_in_physical_memory(),
E_SAMO_Access_Fault() => Access_not_in_physical_memory(),

// All other exceptions (alignment, page faults, envcalls,
// software checks, extension-specific traps, etc.) are not yet
// classified more finely here.
_ => Trap_reason_unclassified()
}

Collaborator

One thing that confuses me a bit is that we currently do not provide precise reasons for why an exception was triggered. At the moment, this approach is quite similar to simply running with --trace-exception enabled.

I think what needs to be done is, for example, in the case of a memory exception during pt_walk, we should return the underlying cause, as defined in PTW_Error. One idea would be to modify translateAddr() and implement something like:

Err(f, ext_ptw) => {
  let exc = translationException(ac, f);
  let reason = ptw_error_to_trap_reason(f);
  Err((exc, reason), ext_ptw)
}

function ptw_error_to_trap_reason(f : PTW_Error) -> trap_reason = {
  match f {
    PTW_No_Access()              => Access_not_in_physical_memory(),
    // the one below is a new union entry
    PTW_Reserved_Bits(addr, pte) => Invalid_pte_reserved_bits_nonzero(bits_of(addr), pte),
    ...
    _                            => Trap_reason_unclassified()
  }
}
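
For readers more familiar with the C++ side, the same mapping idea can be sketched with `std::variant`. Everything below is a hypothetical mirror of the Sail types, not generated code: the struct names, the field names `pte_addr` and `pte`, and the catch-all `PtwOther` are assumptions made for illustration only.

```cpp
#include <cassert>
#include <cstdint>
#include <variant>

// Hypothetical mirrors of the trap_reason constructors.
struct TrapReasonUnclassified {};
struct AccessNotInPhysicalMemory {};
struct InvalidPteReservedBitsNonzero {
  std::uint64_t pte_addr;  // physical address of the offending PTE
  std::uint64_t pte;       // raw PTE value
};
using TrapReason = std::variant<TrapReasonUnclassified,
                                AccessNotInPhysicalMemory,
                                InvalidPteReservedBitsNonzero>;

// Hypothetical mirrors of a few PTW_Error constructors.
struct PtwNoAccess {};
struct PtwReservedBits {
  std::uint64_t pte_addr;
  std::uint64_t pte;
};
struct PtwOther {};  // stands in for every error not classified yet
using PtwError = std::variant<PtwNoAccess, PtwReservedBits, PtwOther>;

// Helper that merges several lambdas into one visitor.
template <class... Ts> struct overloaded : Ts... { using Ts::operator()...; };
template <class... Ts> overloaded(Ts...) -> overloaded<Ts...>;

// Map each page-table-walk error to the most specific reason available,
// carrying the payload along where one exists.
inline TrapReason ptw_error_to_trap_reason(const PtwError &f) {
  return std::visit(
      overloaded{
          [](const PtwNoAccess &) -> TrapReason {
            return AccessNotInPhysicalMemory{};
          },
          [](const PtwReservedBits &e) -> TrapReason {
            return InvalidPteReservedBitsNonzero{e.pte_addr, e.pte};
          },
          [](const PtwOther &) -> TrapReason {
            return TrapReasonUnclassified{};
          }},
      f);
}
```

The point of the sketch is that the payload (PTE address and value) travels with the reason, so a callback can report which PTE was malformed rather than only that a page fault occurred.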

Contributor Author

I’ve added a TrapReason union and wired PTW_Error to it as suggested, so PTW-related memory exceptions now carry both the ExceptionType and a more precise TrapReason.

@ibvqeibob
Contributor Author

a

@ibvqeibob ibvqeibob closed this Dec 13, 2025
@nadime15
Collaborator

Was there a specific reason for closing this PR @ibvqeibob?

@ibvqeibob ibvqeibob reopened this Dec 20, 2025
@ibvqeibob
Contributor Author

> Was there a specific reason for closing this PR @ibvqeibob?

I did not fully understand this part of the process before, so I closed the PR and reviewed it carefully. This revision passes the TrapReason (a more specific trap cause) all the way from the Sail model to the trap_callback in C. The implementation is as follows: on the Sail side, add TrapReason and map the corresponding causes in the exception-generation paths (especially the memory exceptions related to page table walks and address translation); on the C side, update the callback interface signature so that the reason parameter is passed to the upper-layer callback.

Collaborator

@pmundkur pmundkur left a comment

This has a lot of unrelated changes from other PRs. Could you rebase properly to isolate the changes for the traps/callbacks?

@ibvqeibob
Contributor Author

> This has a lot of unrelated changes from other PRs. Could you rebase properly to isolate the changes for the traps/callbacks?

Thanks for the heads-up! I’ve rebased the branch onto the latest upstream/master and squashed the work into a single commit. The PR now contains only the TrapReason plumbing from Sail to the C++ trap_callback, with no unrelated commits.
