@pmatos pmatos commented Jul 28, 2025

This PR attempts to use the ARM64 IOC bit (the cumulative invalid-operation flag in FPSR) to avoid lengthy checks on operation arguments.
This is the F64 counterpart to 726656d.

Copilot AI review requested due to automatic review settings July 28, 2025 18:11

This comment was marked as spam.

bylaws commented Jul 28, 2025

Thank you copilot

bylaws commented Jul 28, 2025

So there are a few things to note here:
The IOC bit is cumulative, so it will need to be zeroed at the entry point to a block and after any non-x87 float operation, which could set the bit on the ARM side but would not on the x86 side.

As is, you flush the IOC bit after every operation. It would make sense to do this only at block exit points, or whenever a non-x87 IOC-setting instruction is used in a block. I think if you extend the IR JSON to track IOC-setting operations, then, since the x87 pass already handles every x87 operation directly, you can just check op->SetsIOC in the non-x87 instruction case to know when to flush.
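The placement described above can be sketched as a small simulation. This is only a model under stated assumptions, not FEX's x87 pass: `SketchOp`, `SketchState`, `Flush`, `Clear`, `RunBlock` and `CheckSketch` are hypothetical names, and the `Invalid` field exists purely so the simulation can say which executions hit an invalid operation.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for an IR op; these are not FEX's real IR types.
struct SketchOp {
  bool IsX87;    // lowered from an x87 instruction
  bool SetsIOC;  // may raise the host's cumulative invalid-operation bit
  bool Invalid;  // simulation input: this execution really is an invalid operation
};

struct SketchState {
  bool HostIOC = false;  // models the cumulative ARM64 FPSR IOC bit
  bool X87IE = false;    // models the sticky x87 status-word IE flag
  int Flushes = 0;       // times IOC was copied into IE
  int Clears = 0;        // times IOC was zeroed without copying
};

// Copy the accumulated host bit into the guest flag, then reset the host bit.
void Flush(SketchState& S) {
  S.X87IE = S.X87IE || S.HostIOC;
  S.HostIOC = false;
  ++S.Flushes;
}

// Zero the host bit without letting it reach the guest flag.
void Clear(SketchState& S) {
  S.HostIOC = false;
  ++S.Clears;
}

void RunBlock(SketchState& S, const std::vector<SketchOp>& Block) {
  Clear(S);  // block entry: drop whatever earlier host code left behind
  bool PendingX87 = false;  // an x87 op ran since the last flush
  for (const SketchOp& Op : Block) {
    if (!Op.SetsIOC) continue;  // op cannot touch the bit at all
    if (Op.IsX87) {
      S.HostIOC = S.HostIOC || Op.Invalid;  // let the host bit accumulate
      PendingX87 = true;
      continue;
    }
    // Non-x87 op that can set IOC: save pending x87 state first,
    // then discard whatever this op leaves in the host bit.
    if (PendingX87) {
      Flush(S);
      PendingX87 = false;
    }
    S.HostIOC = S.HostIOC || Op.Invalid;
    Clear(S);
  }
  if (PendingX87) Flush(S);  // block exit
}

// Usage: an invalid non-x87 op followed by a valid x87 op must not set IE.
void CheckSketch() {
  SketchState S;
  std::vector<SketchOp> Block = {{false, true, true}, {true, true, false}};
  RunBlock(S, Block);
  assert(!S.X87IE);
}
```

In this model a run of consecutive x87 operations costs a single flush, while each non-x87 IOC-setting operation still costs one clear.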


// Set x87 invalid operation bit if IOC is set
// Load as 32-bit to avoid size constraint issues
Ref CurrentIE32 = _LoadContext(OpSize::i32Bit, GPRClass, offsetof(FEXCore::Core::CPUState, flags) + FEXCore::X86State::X87FLAG_IE_LOC);
Member

8-bit has a constraint issue on the load side but not the store side? That seems odd.

Collaborator Author

load8/store8 doesn't work. Let's do load32/store32, which sounds more sensible. Thanks.

@pmatos pmatos force-pushed the feature/x87-invalid-operation-bit_F64 branch 4 times, most recently from bf59bd4 to 9c29587 Compare July 29, 2025 13:38

pmatos commented Jul 29, 2025

> So there are a few things to note here: The IOC bit is cumulative, so it will need to be zeroed at the entry point to a block and after any non-x87 float operation, which could set the bit on the ARM side but would not on the x86 side.
>
> As is, you flush the IOC bit after every operation. It would make sense to do this only at block exit points, or whenever a non-x87 IOC-setting instruction is used in a block. I think if you extend the IR JSON to track IOC-setting operations, then, since the x87 pass already handles every x87 operation directly, you can just check op->SetsIOC in the non-x87 instruction case to know when to flush.

Ah, yes, I do understand the issue: both bits are sticky, but the ARM bit can be set by many other instructions that are not emulating x87, so it is possible to read a wrong value. I need to think of a way to turn that into a unit test, though, and about how best to handle this case.
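The stickiness described here can be observed on a host with the standard `<cfenv>` interface. The following is a minimal sketch, assuming the C++ library's `FE_INVALID` flag is backed by the hardware's cumulative invalid-operation bit (on AArch64 that would typically be the FPSR IOC bit). The helper names are hypothetical and none of this is FEX code.

```cpp
#include <cassert>
#include <cfenv>

// True if the host's cumulative "invalid operation" flag is currently raised.
bool InvalidRaised() { return std::fetestexcept(FE_INVALID) != 0; }

// 0.0 / 0.0 is an invalid operation; volatile keeps the compiler from
// folding the division away at compile time.
double DoInvalidOp() {
  volatile double Zero = 0.0;
  volatile double Result = Zero / Zero;
  return Result;
}

// An ordinary multiplication that raises no invalid-operation exception.
double DoValidOp() {
  volatile double A = 1.5;
  volatile double B = 2.0;
  volatile double Result = A * B;
  return Result;
}

struct Observation {
  bool AfterInvalid;      // flag state right after the invalid operation
  bool AfterValid;        // flag state after a later, perfectly valid operation
  bool AfterSecondClear;  // flag state after explicitly clearing it again
};

Observation DemonstrateStickyInvalid() {
  Observation O{};
  std::feclearexcept(FE_ALL_EXCEPT);
  assert(!InvalidRaised());  // sanity check: start from a clean flag
  DoInvalidOp();
  O.AfterInvalid = InvalidRaised();
  DoValidOp();
  O.AfterValid = InvalidRaised();  // a valid op does not reset the flag
  std::feclearexcept(FE_INVALID);
  O.AfterSecondClear = InvalidRaised();
  return O;
}
```

If the first operation stands for a non-x87 instruction and the second for an x87 one, reading the flag after the second operation reports an invalid operation that the x87 code never caused.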

@pmatos pmatos force-pushed the feature/x87-invalid-operation-bit_F64 branch 2 times, most recently from e721b2a to ae80805 Compare July 31, 2025 11:38

pmatos commented Jul 31, 2025

I have rebased this on top of #4743. Still working on addressing @bylaws's earlier suggestion.

@pmatos pmatos force-pushed the feature/x87-invalid-operation-bit_F64 branch from ae80805 to 55aada1 Compare August 18, 2025 15:15

pmatos commented Aug 19, 2025

Hmm, VIXL doesn't support FPSR, so we might need to skip some tests on the simulator.


pmatos commented Aug 19, 2025

> So there are a few things to note here: The IOC bit is cumulative, so it will need to be zeroed at the entry point to a block and after any non-x87 float operation, which could set the bit on the ARM side but would not on the x86 side.
>
> As is, you flush the IOC bit after every operation. It would make sense to do this only at block exit points, or whenever a non-x87 IOC-setting instruction is used in a block. I think if you extend the IR JSON to track IOC-setting operations, then, since the x87 pass already handles every x87 operation directly, you can just check op->SetsIOC in the non-x87 instruction case to know when to flush.

I've been trying to create a test where emulating a non-x87 invalid operation causes the x87 IE flag to show up as set, through an indirect setting of the FPSR flag on the ARM side. I think this is what you were alluding to with "after any non-x87 float operation, which could set the bit on the arm side but would not on the x86 side", but I can't create a test that does this and nothing comes to mind. Any suggestions?


pmatos commented Aug 20, 2025

> So there are a few things to note here: The IOC bit is cumulative, so it will need to be zeroed at the entry point to a block and after any non-x87 float operation, which could set the bit on the ARM side but would not on the x86 side.
>
> As is, you flush the IOC bit after every operation. It would make sense to do this only at block exit points, or whenever a non-x87 IOC-setting instruction is used in a block. I think if you extend the IR JSON to track IOC-setting operations, then, since the x87 pass already handles every x87 operation directly, you can just check op->SetsIOC in the non-x87 instruction case to know when to flush.

> I've been trying to create a test where emulating a non-x87 invalid operation causes the x87 IE flag to show up as set, through an indirect setting of the FPSR flag on the ARM side. I think this is what you were alluding to with "after any non-x87 float operation, which could set the bit on the arm side but would not on the x86 side", but I can't create a test that does this and nothing comes to mind. Any suggestions?

OK, I found a test that shows the issue mentioned earlier, where the cumulative FPSR bit contaminates the x87 status word IE flag: 62b625f#diff-2380a45bab6932717d4fe2ae0561fc1a1450f30d8430392c7b3ff4b29958fd5e

@pmatos pmatos force-pushed the feature/x87-invalid-operation-bit_F64 branch 2 times, most recently from 1410a27 to 47fb217 Compare August 26, 2025 08:53

pmatos commented Aug 28, 2025

Any comments on the current version of the patch? I am slightly worried that I missed some instructions that set IOC, or haven't marked SetsIOC in some other cases, but after reviewing it, I think I have most cases covered.


for (auto [CodeNode, IROp] : IR->GetCode(BlockNode)) {
// Clear FPSR IOC bit before non-x87 operations that can set it
if (FEXCore::IR::SetsIOC(IROp->Op) && !FEXCore::IR::LoweredX87(IROp->Op)) {
Member

@Sonicadvance1 Sonicadvance1 Sep 1, 2025

The cost of doing this for every instruction that isn't an x87 instruction (and can set IOC) is too high. This needs to be rethought to limit the performance impact.
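To make the concern concrete, the sketch below counts how many clears a loop with the reviewed condition would emit for a block. It is a rough model only: `CostOp` and `CountEmittedClears` are hypothetical names, and the two booleans merely mirror the `SetsIOC` and `LoweredX87` predicates from the snippet above rather than calling FEX's real functions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-op summary; mirrors the two predicates in the reviewed loop.
struct CostOp {
  bool SetsIOC;     // op can raise the host IOC bit
  bool LoweredX87;  // op was lowered from an x87 instruction
};

// One clear is emitted for every op matching SetsIOC && !LoweredX87.
std::size_t CountEmittedClears(const std::vector<CostOp>& Block) {
  std::size_t Clears = 0;
  for (const CostOp& Op : Block) {
    if (Op.SetsIOC && !Op.LoweredX87) {
      ++Clears;
    }
  }
  return Clears;
}

// Usage: eight non-x87 float ops and one x87 op cost eight extra clears.
std::size_t ExampleCost() {
  std::vector<CostOp> Block(8, CostOp{true, false});
  Block.push_back(CostOp{true, true});
  std::size_t Clears = CountEmittedClears(Block);
  assert(Clears <= Block.size());  // never more than one clear per op
  return Clears;
}
```

In this model the overhead scales with the number of non-x87 float operations, even in blocks where x87 code is rare or absent.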

@pmatos pmatos force-pushed the feature/x87-invalid-operation-bit_F64 branch from 47fb217 to 108a199 Compare September 15, 2025 14:13
@pmatos pmatos force-pushed the feature/x87-invalid-operation-bit_F64 branch from 108a199 to f4dc7f3 Compare September 15, 2025 15:31

pmatos commented Oct 30, 2025

I have a few upcoming changes to the x87 stack pass, so I will need to refactor this. Moving it to draft.

@pmatos pmatos marked this pull request as draft October 30, 2025 08:33