Skip to content

gcn.rs: reverse iteration over sclk and mclk levels#15

Closed
jokubasver wants to merge 1 commit intoilya-zlobintsev:masterfrom
jokubasver:vega-voltage-fix
Closed

gcn.rs: reverse iteration over sclk and mclk levels#15
jokubasver wants to merge 1 commit intoilya-zlobintsev:masterfrom
jokubasver:vega-voltage-fix

Conversation

@jokubasver
Copy link
Copy Markdown

Writing the table in ascending order (P0 -> P7) can cause the later high P-state writes to be rejected/clamped by the driver because earlier states were still at higher voltages at the moment the write is processed. This manifests as P7 not going below a floor even though the OD table entry is updated.

Reverse iteration (P7 -> P0) over sclk and mclk levels to ensure that when undervolting, each subsequent write never attempts to set a voltage below a previously committed higher P-state voltage, preventing the driver from rejecting the requested value.

Fixes ilya-zlobintsev/LACT#933

Writing the table in ascending order (P0 -> P7) can cause the later high P-state writes to be rejected/clamped by the driver because earlier states were still at higher voltages at the moment the write is processed. This manifests as P7 not going below a floor even though the OD table entry is updated.

Reverse iteration (P7 -> P0) over sclk and mclk levels to ensure that when undervolting, each subsequent write never attempts to set a voltage below a previously committed higher P-state voltage, preventing the driver from rejecting the requested value.

Fixes ilya-zlobintsev/LACT#933
@ilya-zlobintsev
Copy link
Copy Markdown
Owner

Clarification on the behaviour (since I couldn't reproduce the issue) - are you trying to write a curve that looks something like this:

  • 0 350mhz 800mv
  • 1 600mhz 800mv
  • ...
  • 6 1300mhz 1100mv
  • 7 1500mhz 1050mv

Which gets rejected or clamped by the driver?

If so, I think it should be automatically clamped in the code (no point should have a lower voltage than any of the earlier points) before it is written.
Even if applying the states in reverse works around the issue, I wouldn't rely on this behaviour, since it may be different between various GPUs, and a decreasing voltage curve makes no sense anyway.

@jokubasver
Copy link
Copy Markdown
Author

No, the curve rises normally:
image

@jokubasver
Copy link
Copy Markdown
Author

If you are willing to try the VBIOS I'm using, try flashing this to your Vega 56: https://www.techpowerup.com/vgabios/198265/powercolor-rxvega64-8192-180108

My card is a reference Vega 56 with Samsung HBM - the above linked VBIOS was the most stable and best performant one I've tried. Maybe it's a quirk with this VBIOS.

@ilya-zlobintsev
Copy link
Copy Markdown
Owner

In this case, I don't understand why this change fixes your problem, could you please explain it?
When writing an undervolt in normal order, you're already writing from lowest to highest voltage, you'd never be writing a value lower than one that was already written before. If anything, applying an undervolt in reverse order sounds like it could cause the issue (if you are writing the highest point first on top of the default curve, you might be trying to apply a voltage in p-state 7 that's lower than default p-state 6).

Also, when the clamping/rejection happens currently, does anything get written in dmesg?

@jokubasver
Copy link
Copy Markdown
Author

jokubasver commented Apr 11, 2026

Now that you say it... yeah, it doesn't make sense, but it works for whatever reason. I tested multiple times. Default ascending order - P7 is at 1.1V. Descending order - P7 is at the set 1.037V.

Nothing in dmesg.

Testing again, it's enough to do descending order for mclk levels only - this sets the P7 voltage correctly for me too.

@jokubasver
Copy link
Copy Markdown
Author

I think the actual issue is a quirk with the order of sclk and mclk, and if they are applied independently (first only sclk P0-P7, then mclk P0-P3), or interleaved (P0 sclk+mclk, P1 sclk+mclk, P2 sclk+mclk, P3 sclk+mclk, P4 sclk, P5 sclk, etc etc).

The reverse order of mclk application was probably just a fluke and it worked-around my issue just by chance. Sorry for the confusion.

As an experiment, I've tried an interleaved approach (set P0 for BOTH sclk and mclk, then P1 for both sclk and mclk, etc etc) - this way I can confirm that P7 does get applied correctly for me to my set value of 1037mv.

impl ClocksTable for Table {
    fn write_commands<W: Write>(
        &self,
        writer: &mut W,
        _previous_table: &ClocksTableGen,
    ) -> Result<()> {

        let shared_len = std::cmp::min(self.sclk_levels.len(), self.mclk_levels.len());

        for i in 0..shared_len {
            let s_command = level_command(self.sclk_levels[i], i, 's');
            writer.write_all(s_command.as_bytes())?;

            let m_command = level_command(self.mclk_levels[i], i, 'm');
            writer.write_all(m_command.as_bytes())?;
        }

        for i in shared_len..self.sclk_levels.len() {
            let s_command = level_command(self.sclk_levels[i], i, 's');
            writer.write_all(s_command.as_bytes())?;
        }

        Ok(())
    }

Let me know what you think.

@ilya-zlobintsev
Copy link
Copy Markdown
Owner

That also seems weird, but theoretically shouldn't break things? I can test if that works fine on my card.

@jokubasver
Copy link
Copy Markdown
Author

Please do. If it works fine for you, I will close this PR and open a new one.

@ilya-zlobintsev
Copy link
Copy Markdown
Owner

I've tested the approach, and it works fine on my card. Unfortunately I don't have a Polaris card to test this on, as that would be another generation this change affects, but hopefully it's fine. Change in #16

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Vega 56: GPU voltage cannot go down lower than 1.1V in P7 state

2 participants