-
Notifications
You must be signed in to change notification settings - Fork 1k
Update trigger behavior for memory accesses to match recommended debug specification behavior #2161
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
1225677 to
966e2af
Compare
|
@aswaterman, could you take a look? |
riscv/mmu.cc
Outdated
| bytes += sizeof(reg_t); | ||
| auto xlen_bytes = proc->get_xlen(); | ||
| while (len > xlen_bytes) { | ||
| check_triggers(triggers::OPERATION_LOAD, transformed_addr, access_info.effective_virt, xlen_bytes, reg_from_bytes(xlen_bytes, bytes)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This logic doesn't look right to me. The total number of bytes reported to check_triggers will in general be len rounded up to the next multiple of XLEN/8, rather than simply len. (Consider e.g. the case of the lb instruction, where len is 1.)
Don't you want to use len instead of xlen_bytes in both places?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The total number of bytes reported to check_triggers will in general be len rounded up to the next multiple of XLEN/8
As I understand it would be equal to len exactly. This while loop handles len rounded down to the multiple of XLEN / 8. Tail is handled after while loop here:
check_triggers(triggers::OPERATION_LOAD, transformed_addr, access_info.effective_virt, len, reg_from_bytes(len, bytes));Don't you want to use len instead of xlen_bytes in both places?
Here we check data triggers. Comparing loaded data with tdata2. Size of tdata2 is XLEN. That's why I am iterating over read memory with stride equal xlen_bytes
Did I understand your question correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I understand. I will leave it to @en-sc to review the semantics with that fact in mind.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TBH, I'm not sure this code is correct.
AFAIU, this case can be triggered for a vector memory operation with SEW > XLEN. This case is governed by [5.5.2. Combined Accesses]:
Some instructions lead a hart to perform multiple memory accesses. This includes vector loads and
stores, as well as cm.push and cm.pop instructions. The Trigger Module should match such accesses as
if they all happened individually. E.g. a vector load should be treated as if it performed multiple loads
of size SEW (selected element width), and cm.push should be treated as if it performed multiple stores
of size XLEN.
It seems like there should be a clarification made for the case of a data value being wider then XLEN. For now the select field description only covers sizes less or equal to XLEN.
1 (data): There is exactly one compare value and it contains
the data value loaded or stored, or the instruction
executed. Any bits beyond the size of the data access will
contain 0.
On a side note, do cache management operations come through here too? If so, this needs to be addressed. See [5.5.3. Cache Operations] for details. However, I'd suggest to add a TODO and address it sometime later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please see riscv/riscv-debug-spec#1183
| auto trig_addr = transformed_addr; | ||
| auto xlen_bytes = proc->get_xlen(); | ||
| while (trig_len > xlen_bytes) { | ||
| check_triggers(triggers::OPERATION_STORE, trig_addr, access_info.effective_virt, xlen_bytes, reg_from_bytes(sizeof(reg_t), trig_bytes)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Check: #2161 (comment)
| auto mask = make_mask64(0, xlen); | ||
| value &= mask; | ||
|
|
||
| for (std::size_t i = 0; i < len; ++i) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why in increments of 1 rather than words?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For address triggers. This for loop checks every address in the accessed range
| value &= mask; | ||
|
|
||
| for (std::size_t i = 0; i < len; ++i) | ||
| if (simple_match(xlen, (value + i) & mask)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand the logic here. Why are we adding the current byte offset to the data value?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For data value len == 1 so there is no problem here
reg_t value;
if (select) {
if (!data.has_value())
return std::nullopt;
value = *data;
len = 1;
} else {
value = address;
}There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's very subtle. I agree it appears to be correct, but it probably deserves a comment that the + i only applies in the address case, for the reason you mentioned.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps the call to the simple_match() can be moved into both branches of if (select) (with the loop in the else branch), dropping the value variable?
I believe this will make the code clearer and will eliminate the hack.
aswaterman
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@en-sc Can you help review?
…g specification behavior According to the debug specification (select bit description): 0 (address): There is at least one compare value and it contains the lowest virtual address of the access. In addition, it is recommended that there are additional compare values for the other accessed virtual addresses match. (E.g. on a 32-bit read from 0x4000, the lowest address is 0x4000 and the other addresses are 0x4001, 0x4002, and 0x4003.) 1 (data): There is exactly one compare value and it contains the data value loaded or stored, or the instruction executed. Any bits beyond the size of the data access will contain 0. Previously, when select bit was 0, Spike did not follow the recommendation and provided only 1 matching value to the trigger module. This change modifies the behavior to the recommended one. The implementation follows the debug specification recommendation. Signed-off-by: Farid Khaidari <[email protected]>
966e2af to
b7ed2c1
Compare
|
@aswaterman, this MR may be a little bit confusing. I believe the problem in signature of the original function: void check_triggers(triggers::operation_t operation, reg_t address, bool virt, std::optional<reg_t> data = std::nullopt);In this parameter to be precise: void check_data_triggers(triggers::operation_t operation, reg_t address, bool virt, reg_t data);
void check_addr_triggers(triggers::operation_t operation, reg_t address, bool virt, reg_t tval);Same for So I decided to avoid this splitting. I may introduce it in one of the future MRs |
|
@fkhaidari You're probably right that it would be clearer to split |
@fkhaidari, @aswaterman, unfortunately this will not work. A data trigger can be chained with an address trigger (e.g. a watchpoint on a store of a particular value, see here). |
| } | ||
|
|
||
| std::optional<match_result_t> mcontrol_common_t::detect_memory_access_match(processor_t * const proc, operation_t operation, reg_t address, std::optional<reg_t> data) noexcept { | ||
| std::optional<match_result_t> mcontrol_common_t::detect_memory_access_match(processor_t * const proc, operation_t operation, reg_t address, size_t len, std::optional<reg_t> data) noexcept { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| std::optional<match_result_t> mcontrol_common_t::detect_memory_access_match(processor_t * const proc, operation_t operation, reg_t address, size_t len, std::optional<reg_t> data) noexcept { | |
| std::optional<match_result_t> mcontrol_common_t::detect_memory_access_match(processor_t * const proc, operation_t operation, reg_t address, std::size_t len, std::optional<reg_t> data) noexcept { |
For consistency with the index in the loop.
| value &= 0xffffffff; | ||
| } | ||
| auto mask = make_mask64(0, xlen); | ||
| value &= mask; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd suggest to either:
- Move the masking into the
simple_match(). - Add an assertion in the
simple_match()that the value is masked.
| value &= mask; | ||
|
|
||
| for (std::size_t i = 0; i < len; ++i) | ||
| if (simple_match(xlen, (value + i) & mask)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps the call to the simple_match() can be moved into both branches of if (select) (with the loop in the else branch), dropping the value variable?
I believe this will make the code clearer and will eliminate the hack.
riscv/mmu.cc
Outdated
| bytes += sizeof(reg_t); | ||
| auto xlen_bytes = proc->get_xlen(); | ||
| while (len > xlen_bytes) { | ||
| check_triggers(triggers::OPERATION_LOAD, transformed_addr, access_info.effective_virt, xlen_bytes, reg_from_bytes(xlen_bytes, bytes)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TBH, I'm not sure this code is correct.
AFAIU, this case can be triggered for a vector memory operation with SEW > XLEN. This case is governed by [5.5.2. Combined Accesses]:
Some instructions lead a hart to perform multiple memory accesses. This includes vector loads and
stores, as well as cm.push and cm.pop instructions. The Trigger Module should match such accesses as
if they all happened individually. E.g. a vector load should be treated as if it performed multiple loads
of size SEW (selected element width), and cm.push should be treated as if it performed multiple stores
of size XLEN.
It seems like there should be a clarification made for the case of a data value being wider then XLEN. For now the select field description only covers sizes less or equal to XLEN.
1 (data): There is exactly one compare value and it contains
the data value loaded or stored, or the instruction
executed. Any bits beyond the size of the data access will
contain 0.
On a side note, do cache management operations come through here too? If so, this needs to be addressed. See [5.5.3. Cache Operations] for details. However, I'd suggest to add a TODO and address it sometime later.
Ah, right. |
According to the debug specification (select bit description):
0 (address): There is at least one compare value and it contains the lowest virtual address of the access. In addition, it is recommended that there are additional compare values for the other accessed virtual addresses match. (E.g. on a 32-bit read from 0x4000, the lowest address is 0x4000 and the other addresses are 0x4001, 0x4002, and 0x4003.)
1 (data): There is exactly one compare value and it contains the data value loaded or stored, or the instruction executed. Any bits beyond the size of the data access will contain 0.
Previously, when select bit was 0, Spike did not follow the recommendation and provided only 1 matching value to the trigger module. This change modifies the behavior to the recommended one.
The implementation follows the debug specification recommendation.