Skip to content

Conversation

@fkhaidari
Copy link
Contributor

According to the debug specification (select bit description):

0 (address): There is at least one compare value and it contains the lowest virtual address of the access. In addition, it is recommended that there are additional compare values for the other accessed virtual addresses match. (E.g. on a 32-bit read from 0x4000, the lowest address is 0x4000 and the other addresses are 0x4001, 0x4002, and 0x4003.)

1 (data): There is exactly one compare value and it contains the data value loaded or stored, or the instruction executed. Any bits beyond the size of the data access will contain 0.

Previously, when select bit was 0, Spike did not follow the recommendation and provided only 1 matching value to the trigger module. This change modifies the behavior to the recommended one.

The implementation follows the debug specification recommendation.

@fkhaidari
Copy link
Contributor Author

@aswaterman, could you take a look?

riscv/mmu.cc Outdated
bytes += sizeof(reg_t);
auto xlen_bytes = proc->get_xlen();
while (len > xlen_bytes) {
check_triggers(triggers::OPERATION_LOAD, transformed_addr, access_info.effective_virt, xlen_bytes, reg_from_bytes(xlen_bytes, bytes));
Copy link
Collaborator

@aswaterman aswaterman Nov 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This logic doesn't look right to me. The total number of bytes reported to check_triggers will in general be len rounded up to the next multiple of XLEN/8, rather than simply len. (Consider e.g. the case of the lb instruction, where len is 1.)

Don't you want to use len instead of xlen_bytes in both places?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The total number of bytes reported to check_triggers will in general be len rounded up to the next multiple of XLEN/8

As I understand it would be equal to len exactly. This while loop handles len rounded down to the multiple of XLEN / 8. Tail is handled after while loop here:

check_triggers(triggers::OPERATION_LOAD, transformed_addr, access_info.effective_virt, len, reg_from_bytes(len, bytes));

Don't you want to use len instead of xlen_bytes in both places?

Here we check data triggers. Comparing loaded data with tdata2. Size of tdata2 is XLEN. That's why I am iterating over read memory with stride equal xlen_bytes

Did I understand your question correct?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I understand. I will leave it to @en-sc to review the semantics with that fact in mind.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TBH, I'm not sure this code is correct.

AFAIU, this case can be triggered for a vector memory operation with SEW > XLEN. This case is governed by [5.5.2. Combined Accesses]:

Some instructions lead a hart to perform multiple memory accesses. This includes vector loads and
stores, as well as cm.push and cm.pop instructions. The Trigger Module should match such accesses as
if they all happened individually. E.g. a vector load should be treated as if it performed multiple loads
of size SEW (selected element width), and cm.push should be treated as if it performed multiple stores
of size XLEN.

It seems like there should be a clarification made for the case of a data value being wider then XLEN. For now the select field description only covers sizes less or equal to XLEN.

1 (data): There is exactly one compare value and it contains
the data value loaded or stored, or the instruction
executed. Any bits beyond the size of the data access will
contain 0.

On a side note, do cache management operations come through here too? If so, this needs to be addressed. See [5.5.3. Cache Operations] for details. However, I'd suggest to add a TODO and address it sometime later.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

auto trig_addr = transformed_addr;
auto xlen_bytes = proc->get_xlen();
while (trig_len > xlen_bytes) {
check_triggers(triggers::OPERATION_STORE, trig_addr, access_info.effective_virt, xlen_bytes, reg_from_bytes(sizeof(reg_t), trig_bytes));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

auto mask = make_mask64(0, xlen);
value &= mask;

for (std::size_t i = 0; i < len; ++i)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why in increments of 1 rather than words?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For address triggers. This for loop checks every address in the accessed range

value &= mask;

for (std::size_t i = 0; i < len; ++i)
if (simple_match(xlen, (value + i) & mask)) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand the logic here. Why are we adding the current byte offset to the data value?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For data value len == 1 so there is no problem here

  reg_t value;
  if (select) {
    if (!data.has_value())
      return std::nullopt;
    value = *data;
    len = 1;
  } else {
    value = address;
  }

Copy link
Collaborator

@aswaterman aswaterman Nov 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's very subtle. I agree it appears to be correct, but it probably deserves a comment that the + i only applies in the address case, for the reason you mentioned.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps the call to the simple_match() can be moved into both branches of if (select) (with the loop in the else branch), dropping the value variable?
I believe this will make the code clearer and will eliminate the hack.

Copy link
Collaborator

@aswaterman aswaterman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@en-sc Can you help review?

…g specification behavior

According to the debug specification (select bit description):

0 (address): There is at least one compare value and it contains
the lowest virtual address of the access. In addition, it is recommended
that there are additional compare values for the other accessed
virtual addresses match. (E.g. on a 32-bit read from 0x4000,
the lowest address is 0x4000 and the other addresses
are 0x4001, 0x4002, and 0x4003.)

1 (data): There is exactly one compare value and it contains
the data value loaded or stored, or the instruction executed.
Any bits beyond the size of the data access will contain 0.

Previously, when select bit was 0, Spike did not follow
the recommendation and provided only 1 matching value to the trigger module.
This change modifies the behavior to the recommended one.

The implementation follows the debug specification recommendation.

Signed-off-by: Farid Khaidari <[email protected]>
@fkhaidari
Copy link
Contributor Author

@aswaterman, this MR may be a little bit confusing. I believe the problem in signature of the original function:

void check_triggers(triggers::operation_t operation, reg_t address, bool virt, std::optional<reg_t> data = std::nullopt);

In this parameter to be precise: std::optional<reg_t> data = std::nullopt. If data == std::nullopt it behaves like address trigger checker and checks whole range. Otherwise it is data trigger checker. Probably it is better to improve this interface:

void check_data_triggers(triggers::operation_t operation, reg_t address, bool virt, reg_t data);
void check_addr_triggers(triggers::operation_t operation, reg_t address, bool virt, reg_t tval);

Same for std::optional<match_result_t> module_t::detect_memory_access_match(operation_t operation, reg_t address, std::optional<reg_t> data) and virtual std::optional<match_result_t> trigger_t::detect_memory_access_match(processor_t * const proc, operation_t operation, reg_t address, std::optional<reg_t> data).

So I decided to avoid this splitting. I may introduce it in one of the future MRs

@aswaterman
Copy link
Collaborator

@fkhaidari You're probably right that it would be clearer to split check_triggers into two check_data_triggers and check_addr_triggers.

@en-sc
Copy link
Contributor

en-sc commented Nov 26, 2025

it would be clearer to split check_triggers into two check_data_triggers and check_addr_triggers.

@fkhaidari, @aswaterman, unfortunately this will not work.

A data trigger can be chained with an address trigger (e.g. a watchpoint on a store of a particular value, see here).
For this to work, we will need for both, address and data, to be available when checking the chain.

}

std::optional<match_result_t> mcontrol_common_t::detect_memory_access_match(processor_t * const proc, operation_t operation, reg_t address, std::optional<reg_t> data) noexcept {
std::optional<match_result_t> mcontrol_common_t::detect_memory_access_match(processor_t * const proc, operation_t operation, reg_t address, size_t len, std::optional<reg_t> data) noexcept {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
std::optional<match_result_t> mcontrol_common_t::detect_memory_access_match(processor_t * const proc, operation_t operation, reg_t address, size_t len, std::optional<reg_t> data) noexcept {
std::optional<match_result_t> mcontrol_common_t::detect_memory_access_match(processor_t * const proc, operation_t operation, reg_t address, std::size_t len, std::optional<reg_t> data) noexcept {

For consistency with the index in the loop.

value &= 0xffffffff;
}
auto mask = make_mask64(0, xlen);
value &= mask;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd suggest to either:

  1. Move the masking into the simple_match().
  2. Add an assertion in the simple_match() that the value is masked.

value &= mask;

for (std::size_t i = 0; i < len; ++i)
if (simple_match(xlen, (value + i) & mask)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps the call to the simple_match() can be moved into both branches of if (select) (with the loop in the else branch), dropping the value variable?
I believe this will make the code clearer and will eliminate the hack.

riscv/mmu.cc Outdated
bytes += sizeof(reg_t);
auto xlen_bytes = proc->get_xlen();
while (len > xlen_bytes) {
check_triggers(triggers::OPERATION_LOAD, transformed_addr, access_info.effective_virt, xlen_bytes, reg_from_bytes(xlen_bytes, bytes));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TBH, I'm not sure this code is correct.

AFAIU, this case can be triggered for a vector memory operation with SEW > XLEN. This case is governed by [5.5.2. Combined Accesses]:

Some instructions lead a hart to perform multiple memory accesses. This includes vector loads and
stores, as well as cm.push and cm.pop instructions. The Trigger Module should match such accesses as
if they all happened individually. E.g. a vector load should be treated as if it performed multiple loads
of size SEW (selected element width), and cm.push should be treated as if it performed multiple stores
of size XLEN.

It seems like there should be a clarification made for the case of a data value being wider then XLEN. For now the select field description only covers sizes less or equal to XLEN.

1 (data): There is exactly one compare value and it contains
the data value loaded or stored, or the instruction
executed. Any bits beyond the size of the data access will
contain 0.

On a side note, do cache management operations come through here too? If so, this needs to be addressed. See [5.5.3. Cache Operations] for details. However, I'd suggest to add a TODO and address it sometime later.

@aswaterman
Copy link
Collaborator

For this to work, we will need for both, address and data, to be available when checking the chain.

Ah, right.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants