Handle address translation for misaligned loads and stores better #861

Merged 3 commits into riscv:master from ldst_misaligned_take2 on May 7, 2025

Conversation

pmundkur
Collaborator

@pmundkur pmundkur commented Apr 15, 2025

Refactor the LOAD and STORE instructions so they split misaligned
accesses into multiple sub-accesses and perform address translation
separately for each one. This means we should now handle the case where a
misaligned access straddles a page boundary in a sensible way, even if we
don't yet cover the full range of possibilities allowed for any RISC-V
implementation.

There are options for the order in which the misaligned sub-accesses happen,
i.e. from high-to-low or from low-to-high, as well as the granularity of the
splitting, either all the way to bytes or to the largest aligned size. The
splitting can also be disabled if an implementation supports misaligned
accesses in hardware.

In addition, tidy up the implementation in a few ways:

  • Fix very long lines in the LOAD encdec by adding a helper

  • Add some linebreaks in the code so it reads as less claustrophobic

  • Ensure we use the same names for arguments in encdec/execute/assembly.
    Previously we used both 'size' and 'width'. I opted for 'width' consistently.

Co-authored-by: Alasdair Armstrong [email protected]
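
The splitting options described above can be illustrated with a minimal Python sketch. It is not the Sail code from this PR: the function names, the `to_bytes` and `high_to_low` parameters, and the exact splitting rule are assumptions made for illustration, and `width` is assumed to be a power of two.

```python
def split_misaligned(vaddr: int, width: int, to_bytes: bool = False) -> tuple[int, int]:
    """Return (n, size): the number of sub-accesses and the width of each.

    Hypothetical model: an aligned access is not split; a misaligned one is
    split either into single bytes or into the largest aligned chunks.
    """
    if vaddr % width == 0:
        return 1, width
    if to_bytes:
        return width, 1
    # Halve the chunk size until the start address is aligned to it.
    size = width
    while vaddr % size != 0:
        size //= 2
    return width // size, size


def sub_accesses(vaddr: int, width: int, to_bytes: bool = False,
                 high_to_low: bool = False) -> list[tuple[int, int]]:
    """List the (address, size) pairs in the order they would be performed."""
    n, size = split_misaligned(vaddr, width, to_bytes)
    addrs = [vaddr + i * size for i in range(n)]
    if high_to_low:
        addrs.reverse()
    return [(a, size) for a in addrs]
```

For example, under these assumptions a 4-byte access at `0x1002` becomes two 2-byte sub-accesses, and a 2-byte access at `0xFFF` becomes two byte accesses that land on different 4 KiB pages, so each one needs its own translation.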

@pmundkur pmundkur requested a review from Alasdair April 15, 2025 20:04

github-actions bot commented Apr 15, 2025

Test Results

400 tests ±0: 400 ✅ passed ±0, 0 💤 skipped ±0, 0 ❌ failed ±0
1 suite ±0, 1 file ±0, duration 1m 44s ⏱️ (-1s)

Results for commit f25e661. ± Comparison against base commit 3bdcf27.

♻️ This comment has been updated with latest results.

Contributor

@nadime15 nadime15 left a comment

Looks awesome! Just a few comments

Comment on lines 392 to 397
match vmem_write(vaddr, width_bytes, data, aq, rl, false) {
Ok(_) => RETIRE_SUCCESS,
Err(vaddr, e) => {
handle_mem_exception(vaddr, e);
return RETIRE_FAIL
}
Contributor

Suggested change
- match vmem_write(vaddr, width_bytes, data, aq, rl, false) {
- Ok(_) => RETIRE_SUCCESS,
- Err(vaddr, e) => {
- handle_mem_exception(vaddr, e);
- return RETIRE_FAIL
- }
+ match vmem_write(vaddr, width_bytes, data, aq, rl, false) {
+ Ok(_) => RETIRE_SUCCESS,
+ Err(vaddr, e) => { handle_mem_exception(vaddr, e); return RETIRE_FAIL }

To match model/riscv_insts_fext.sail and the rest.

Collaborator Author

This will change anyway once I rebase on top of #755.

Comment on lines 307 to 308
Err(vaddr, e) => { handle_mem_exception(vaddr, e); RETIRE_FAIL },
Ok(result) => { F(rd) = nan_box(result); RETIRE_SUCCESS }
Contributor

Suggested change
- Err(vaddr, e) => { handle_mem_exception(vaddr, e); RETIRE_FAIL },
- Ok(result) => { F(rd) = nan_box(result); RETIRE_SUCCESS }
+ Ok(result) => { F(rd) = nan_box(result); RETIRE_SUCCESS }
+ Err(vaddr, e) => { handle_mem_exception(vaddr, e); RETIRE_FAIL },

Collaborator Author

Thanks, done.

function prop_access_within_is_aligned(addr : bits(32), bytes : bits(4)) -> bool = {
let bytes = unsigned(zero_extend(32, 0b1) << unsigned(bytes));
if bytes > 0 then {
access_within(addr, bytes, bytes) == (fmod_int(unsigned(addr), bytes) == 0)
Contributor

What is fmod_int() doing?

Collaborator Author

Good question. It's integer modulus. I now notice there are two of these: emod_int and fmod_int. The first always returns a non-negative value; the second ('f' stands for 'floor' according to the GMP docs) takes the sign of the divisor. I'm not sure the difference matters here. @Alasdair?

Collaborator

fdiv and fmod are flooring, so they always round down. tdiv and tmod are truncating, so they round towards zero. emod and ediv are Euclidean division. Wikipedia has a good summary of the differences here: https://en.wikipedia.org/wiki/Modulo#Variants_of_the_definition
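
To make the three variants concrete, here is a small Python sketch. The function names `fmod`, `tmod`, and `emod` are chosen here to echo the Sail names; they are illustrative definitions, not the Sail or GMP implementations.

```python
def fmod(a: int, b: int) -> int:
    """Floored modulus: the quotient rounds down, the result takes the sign of b."""
    return a - b * (a // b)  # Python's // already floors


def tmod(a: int, b: int) -> int:
    """Truncated modulus: the quotient rounds towards zero, the result takes the sign of a."""
    q = abs(a) // abs(b)
    if (a < 0) != (b < 0):
        q = -q
    return a - b * q


def emod(a: int, b: int) -> int:
    """Euclidean modulus: the result r always satisfies 0 <= r < |b|."""
    return a % abs(b)  # Python's % is non-negative for a positive divisor


# The variants differ only when an operand is negative.
print(fmod(-7, 2), tmod(-7, 2), emod(-7, 2))  # 1 -1 1
print(fmod(7, -2), tmod(7, -2), emod(7, -2))  # -1 1 1
```

For a non-negative dividend and a positive divisor, which is the situation in the property under review, all three functions return the same value.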

Collaborator

The only reason we really have Euclidean division is that it's the definition used in the SMT integer theory.

Collaborator

Seems like since these are both positive, they’d all be the same? In which case we might as well use %, which is already defined in the prelude to mean emod.

Collaborator Author

@pmundkur pmundkur Apr 17, 2025

The SMT properties check out in both cases, so I'll switch it to % when I rebase.
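
The property discussed in this thread can also be spot-checked outside the SMT backend. The sketch below is in Python and rests on an assumption about what `access_within` means, namely that the whole access lies inside one naturally aligned region of the given width, with addresses wrapping at 32 bits. The real Sail definition may differ in detail.

```python
XLEN = 32  # address width used by the property in the review snippet


def access_within(addr: int, width: int, region_width: int) -> bool:
    """Assumed semantics: the bytes [addr, addr + width) all fall inside one
    naturally aligned region of region_width bytes (addresses wrap at 2**XLEN)."""
    last = (addr + width - 1) % (1 << XLEN)
    return addr // region_width == last // region_width


def prop_access_within_is_aligned(addr: int, log2_bytes: int) -> bool:
    """Mirror of the reviewed property, written with % in place of fmod_int."""
    nbytes = 1 << log2_bytes
    return access_within(addr, nbytes, nbytes) == (addr % nbytes == 0)


# Spot check around region boundaries and at the top of the address space.
for k in range(16):
    b = 1 << k
    for addr in (0, 1, b - 1, b, b + 1, 2 * b, 0xFFFFFFFF, 0xFFFFFFFF - b + 1):
        assert prop_access_within_is_aligned(addr, k)
```

Under that assumption, an access is within a region of its own width exactly when it is aligned to that width, which is what the property states.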

@pmundkur pmundkur force-pushed the ldst_misaligned_take2 branch from 9323a72 to a23d5fc Compare April 18, 2025 19:45
@pmundkur
Collaborator Author

The last push rebased on master and refactored #467 to pull the ext_data_get_addr calls into the vmem_utils helpers. This cleans up the calling code significantly, and (hopefully) isolates almost all the RVWMO-relevant pieces for usual loads/stores into vmem_utils.

As before, AMOs, fetch, and CBOs are untouched.

There is a slight difference in alignment checks for LOADRES/STORECON in the read/write helpers. I'll try to clean it up before final merge.

@Alasdair could you check if this breaks RVWMO modelling in any way? There's a comment in vmem_utils that I retained from your #467 that may need adjusting?

@pmundkur
Collaborator Author

Hmm, the CI failure is odd. It's building locally (and passing tests) with Sail 0.19.

@pmundkur
Collaborator Author

> Hmm, the CI failure is odd. It's building locally (and passing tests) with Sail 0.19.

Drat, I think this is triggering a bug in the smt backend of Sail, both 0.19 and the latest master.

@pmundkur pmundkur force-pushed the ldst_misaligned_take2 branch from a23d5fc to d5ffd8c Compare April 18, 2025 21:16
@pmundkur pmundkur added the tgmm-agenda label (Tagged for the next Golden Model meeting agenda.) Apr 18, 2025
@pmundkur pmundkur force-pushed the ldst_misaligned_take2 branch 2 times, most recently from 066874c to fec2fa5 Compare April 21, 2025 23:56
@pmundkur
Collaborator Author

@bacam Could you take a look at this Rocq failure? Perhaps it's being triggered by the match being inside a repeat? Adding a let _ : unit = match { ... }; annotation did not help.

Collaborator

@jordancarlin jordancarlin left a comment

Overall looking like an amazing simplification!


// If the Zama16b extension is enabled, the region_width must be at least 16
let region_width = if currentlyEnabled(Ext_Zama16b) then {
max_int(16, region_width)
Collaborator

Suggested change
- max_int(16, region_width)
+ max(16, region_width)

Collaborator Author

Fixed.
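
The effect of widening the region for Zama16b can be sketched in Python as follows. The helper names and the `needs_split` decision are hypothetical, and `access_within` is assumed to mean "the access stays inside one naturally aligned region"; only the `max(16, region_width)` step comes from the snippet under review.

```python
def access_within(addr: int, width: int, region_width: int) -> bool:
    """Assumed semantics: the access stays inside one aligned region."""
    return addr // region_width == (addr + width - 1) // region_width


def effective_region_width(region_width: int, zama16b_enabled: bool) -> int:
    """With Zama16b, misaligned accesses within 16 bytes must be supported."""
    return max(16, region_width) if zama16b_enabled else region_width


def needs_split(addr: int, width: int, region_width: int, zama16b_enabled: bool) -> bool:
    """Hypothetical decision: split only misaligned accesses that leave the region."""
    if addr % width == 0:
        return False
    region = effective_region_width(region_width, zama16b_enabled)
    return not access_within(addr, width, region)
```

With a configured region width of 1 (no misaligned support), a 4-byte access at `0x1006` would be split unless Zama16b is enabled, while one at `0x100E` crosses a 16-byte boundary and would be split either way.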

See the [Virtual Memory Notes](./notes_Virtual_Memory.adoc) for
details.

- The `riscv_vmem_utils.sail` file provides a higher level interface
Collaborator

Might be worth adding a more detailed description to the notes_Virtual_Memory.adoc file.

Collaborator Author

I'm not sure what can go in there in addition to what's already described in comments in riscv_vmem_utils.sail.

@pmundkur pmundkur force-pushed the ldst_misaligned_take2 branch from fec2fa5 to 597d678 Compare April 22, 2025 14:09
@pmundkur pmundkur force-pushed the ldst_misaligned_take2 branch 2 times, most recently from 4fd40f9 to 50034c4 Compare April 24, 2025 15:01
/* If the load is misaligned or an allowed misaligned access, split into `n`
(single-copy-atomic) memory operations, each of `bytes` width. If the load is
aligned, then `n` = 1 and bytes will remain unchanged. */
let ('n, bytes) = split_misaligned(vaddr, width);
Collaborator

We can work around the Rocq problem by changing bytes to 'bytes - this gives data a printable type that can be used in rewrites. (Rocq also needs a couple of termination measures, but I'll give those separately.)

@bacam
Collaborator

bacam commented Apr 25, 2025

For Rocq output we also need to add two termination measures to riscv_termination.sail:

termination_measure vmem_write_addr repeat n
termination_measure vmem_read repeat n

Although it would be helpful if someone could check that n is the correct limit in both cases.

@pmundkur
Collaborator Author

> For Rocq output we also need to add two termination measures to riscv_termination.sail:
>
> termination_measure vmem_write_addr repeat n
> termination_measure vmem_read repeat n
>
> Although it would be helpful if someone could check that n is the correct limit in both cases.

Yeah, I think n is the correct limit since the loop should repeat at most n times in both cases.

Your suggestions fixed the build. Thanks!

@pmundkur pmundkur force-pushed the ldst_misaligned_take2 branch from 132308b to 3c0a61a Compare April 28, 2025 23:21
Contributor

@nadime15 nadime15 left a comment

Looks good!

It would be great to prioritize and get this merged so work on the other PRs can move forward!

@pmundkur pmundkur force-pushed the ldst_misaligned_take2 branch from 3c0a61a to 677dab9 Compare April 29, 2025 20:23
Collaborator

@jordancarlin jordancarlin left a comment

LGTM now that the configuration issue has been resolved. This will be great to get in!

@pmundkur pmundkur mentioned this pull request Apr 29, 2025
@pmundkur pmundkur force-pushed the ldst_misaligned_take2 branch from 677dab9 to e268359 Compare April 30, 2025 17:12
@pmundkur pmundkur removed the tgmm-agenda label (Tagged for the next Golden Model meeting agenda.) Apr 30, 2025
@pmundkur
Collaborator Author

pmundkur commented May 5, 2025

I'll merge this tomorrow.

@pmundkur pmundkur added the will be merged label (Scheduled to be merged in a few days if nobody objects) May 5, 2025
Collaborator

@Timmmm Timmmm left a comment

Do you mind waiting a day for me to take a more detailed look? Sorry I've been putting it off but I'll look tomorrow. I'd like to make sure it works for us and also with #866.

Also I think we really need to clean up the AccessType system because the behaviour is kind of half split between AccessType and the res/aq/con flags. I kind of think we should remove those flags and move everything into AccessType. (In future of course.)

@pmundkur pmundkur force-pushed the ldst_misaligned_take2 branch from e268359 to 4453ef4 Compare May 6, 2025 01:43
@pmundkur pmundkur force-pushed the ldst_misaligned_take2 branch from 4453ef4 to 8b7febc Compare May 6, 2025 14:38
pmundkur and others added 2 commits May 6, 2025 09:43
Refactor the LOAD and STORE instructions so they split misaligned
accesses into multiple sub-accesses and perform address translation
separately for each one. This means we should now handle the case where a
misaligned access straddles a page boundary in a sensible way, even if we
don't yet cover the full range of possibilities allowed for any RISC-V
implementation.

There are options for the order in which the misaligned sub-accesses happen,
i.e. from high-to-low or from low-to-high, as well as the granularity of the
splitting, either all the way to bytes or to the largest aligned size. The
splitting can also be disabled if an implementation supports misaligned
accesses in hardware.

In addition, tidy up the implementation in a few ways:

- Fix very long lines in the LOAD encdec by adding a helper

- Add some linebreaks in the code so it reads as less claustrophobic

- Ensure we use the same names for arguments in encdec/execute/assembly.
  Previously we used both 'size' and 'width'. I opted for 'width' consistently.

Primary author: Alasdair Armstrong <[email protected]>

Co-authored-by: Alasdair Armstrong <[email protected]>

Add some comments on the API available from `vmem_utils`.

Update Makefile.old for SMT properties.

Update the ReadingGuide.
@pmundkur pmundkur force-pushed the ldst_misaligned_take2 branch from 8b7febc to 3477a6f Compare May 6, 2025 14:49
Collaborator

@Timmmm Timmmm left a comment

Thanks, LGTM!

I think we will probably want to make this more configurable in future, but it at least fixes the page straddling problem which is currently totally broken... 🚀

@Timmmm Timmmm changed the title from "Better handling of misaligned accesses (take 2)" to "Better handling of misaligned accesses" May 7, 2025
@Timmmm Timmmm changed the title from "Better handling of misaligned accesses" to "Handle address translation for misaligned loads and stores better" May 7, 2025
@Timmmm Timmmm added this pull request to the merge queue May 7, 2025
Merged via the queue into riscv:master with commit cd50ea2 May 7, 2025
2 checks passed
github-merge-queue bot pushed a commit that referenced this pull request May 8, 2025
Lean does not use termination measures for loops, so we need to guard
their definitions.

The problematic measures were introduced by #861.
@pmundkur pmundkur deleted the ldst_misaligned_take2 branch June 10, 2025 15:33
Labels
will be merged Scheduled to be merged in a few days if nobody objects
6 participants