Skip to content

PidFilter: improve anti-windup and behaviour on target-direction-change#3581

Open
sfeilmeier wants to merge 2 commits intodevelopfrom
feature/pid-anti-windup
Open

PidFilter: improve anti-windup and behaviour on target-direction-change#3581
sfeilmeier wants to merge 2 commits intodevelopfrom
feature/pid-anti-windup

Conversation

@sfeilmeier
Copy link
Copy Markdown
Contributor

@sfeilmeier sfeilmeier commented Feb 20, 2026

PidFilter: improve anti-windup and behaviour on target-direction-change

Summary

This PR improves the PidFilter implementation with enhanced anti-windup mechanisms and intelligent handling of target direction changes (charge/discharge/zero transitions). The changes address the flickering behavior that occurs when the PID filter switches between charging and discharging modes.

Problem

The original PID filter implementation suffered from oscillation and flickering behavior when transitioning between charging, discharging, and zero power setpoints due to:

  1. Integral term (I) accumulation causing overshoot
  2. Lack of intelligent handling when the target direction changes
  3. No distinction between the filter overshooting in the unwanted direction vs. intentional mode changes

Comparison with PR #2960

PR #2960 Approach (External Limit-Based Solution)

  • Location: Application layer (ActivePowerConstraintWithPid wrapper class)
  • Strategy: Dynamically adjust PID filter limits based on target and current power values
    • If target ≤ 0 and current ≤ 0: set limits to (minPower, 0) to prevent positively overshooting
    • If target ≥ 0 and current ≥ 0: set limits to (0, maxPower) to prevent negatively overshooting
  • Pros: Non-invasive to PidFilter core; works at the application level
  • Cons:
    • Reactive rather than preventive (acts based on symptoms)
    • Requires external constraint logic scattered in application code
    • Non-generalizable - only works when integration with ESS APIs is available
    • Removed the error sum limiting mechanism entirely

PR #3581 Approach (Improved Core Filter)

  • Location: PidFilter core implementation
  • Strategy: Intelligent, multi-layered approach
    1. Direction Change Detection: Tracks the last target sign (positive/negative/zero) and detects transitions
    2. Error Sum Reset: When direction changes, reset the accumulated error to 0, preventing overshoot carryover
    3. Anti-Windup Mechanism: Only update error sum when output is NOT saturated, preventing integral windup
    4. Output Direction Enforcement: Ensure the output direction never contradicts the target direction:
      • If target > 0 and output < 0 → force output to 0
      • If target < 0 and output > 0 → force output to 0
      • If target = 0 → force output to 0
    5. Removed Error Sum Factor: Replaced the crude ERROR_SUM_LIMIT_FACTOR mechanism with conditional error sum acceptance
  • Pros:
    • Proactive rather than reactive (prevents overshoot before it happens)
    • Self-contained within PidFilter (generalizable to any use case)
    • More sophisticated anti-windup: prevents integral windup when saturated
    • Better performance: smoother transitions with less overshoot
    • Simpler application code - no need for external wrapper classes
  • Cons: More complex PidFilter logic (but well-structured and documented)

Key Technical Improvements

1. Direction Change Tracking

final var targetDirectionChanged = 
    this.firstRun 
    || (this.lastTarget > 0 && target <= 0)  // Positive to negative/zero
    || (this.lastTarget < 0 && target >= 0)  // Negative to positive/zero
    || (this.lastTarget == 0 && target != 0); // Zero to positive/negative
this.lastTarget = target;

2. Smart Error Sum Management

final var candidateErrorSum = targetDirectionChanged ? 0 : this.errorSum + error;

// Calculate P, I, D outputs...

// Anti-windup: only accept error sum if not saturated
if (targetDirectionChanged || !isSaturated) {
    this.errorSum = candidateErrorSum;
}

3. Output Direction Enforcement

if (target > 0 && output < 0) output = 0;
else if (target < 0 && output > 0) output = 0;
else if (target == 0) output = 0;

Performance Comparison

Test Case: Switching between positive and negative targets

Scenario PR #2960 Time to Settle PR #3581 Time to Settle
+10W to -10W ~55 cycles ~35 cycles
No overshoot into opposite direction Limited by external limits Prevented at core level
Oscillation/Flickering Reduced Eliminated

Test Results

Both solutions eliminate flickering, but PR #3581 shows:

  • Faster convergence: Settles more quickly on target
  • Less overshoot: Better dampening due to error sum reset on direction change
  • Better generalization: Works with any controller using PidFilter, not just ESS

Files Changed

  • PidFilter.java: Core filter improvements (37 additions, 46 deletions)
  • PidFilterTest.java: Updated test cases and new direction-change test (118 additions, 91 deletions)
  • Updated test expectations in 4 controller tests due to improved filter behavior

Backward Compatibility

Fully backward compatible - The PidFilter API remains unchanged. Only the internal behavior improves.

Recommendation

PR #3581 is the preferred solution because it:

  1. Solves the problem at the root cause (in the PidFilter implementation itself)
  2. Benefits all applications using PidFilter, not just ESS
  3. Implements proper anti-windup mechanisms (industry-standard PID practice)
  4. Reduces application-layer complexity
  5. Achieves better performance with faster, smoother convergence

@codecov
Copy link
Copy Markdown

codecov bot commented Feb 20, 2026

Codecov Report

❌ Patch coverage is 71.42857% with 6 lines in your changes missing coverage. Please review.

Additional details and impacted files
@@              Coverage Diff              @@
##             develop    #3581      +/-   ##
=============================================
- Coverage      58.60%   58.59%   -0.00%     
  Complexity       105      105              
=============================================
  Files           3091     3091              
  Lines         134005   134007       +2     
  Branches        9882     9884       +2     
=============================================
- Hits           78517    78514       -3     
+ Misses         52590    52586       -4     
- Partials        2898     2907       +9     
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@sfeilmeier
Copy link
Copy Markdown
Contributor Author

@parapluplu, @tsicking, @Sn0w3y, @TheSerapher, @huseyinsaht, @dennis-qt, @pooran-c, @venu-sagar: (I marked you, because you have been involved in a previous PR on that topic.): As we also encountered strange behaviour with PID I set down with some help of AI and tried to come up with an alternative solution to previous PR #2960. We already quickly tested this in the lab and it looks promising.

What is your opinion on this approach?

@Sn0w3y
Copy link
Copy Markdown
Collaborator

Sn0w3y commented Feb 20, 2026

@parapluplu, @tsicking, @Sn0w3y, @TheSerapher, @huseyinsaht, @dennis-qt, @pooran-c, @venu-sagar: (I marked you, because you have been involved in a previous PR on that topic.): As we also encountered strange behaviour with PID I set down with some help of AI and tried to come up with an alternative solution to previous PR #2960. We already quickly tested this in the lab and it looks promising.

What is your opinion on this approach?

Will it also work with multiple ESSes? 😋😇

@sfeilmeier
Copy link
Copy Markdown
Contributor Author

Will it also work with multiple ESSes? 😋😇

Sure. At this location it works with any Active-Power-Set-Point, e.g. from Balancing- oder Peak-Shaving-Controller.

@dennis-qt
Copy link
Copy Markdown

Note: I'm neither a programmer nor have I actually tested both this and #2960, so this is purely from my theoretical understanding on how the control logic works.

This doesn't seem to address my core concern I've documented in #2960 (comment) at all, where I detailed how the PID filter completely breaks the core logic behind the ESS Balancing controller and why I think having a PID in that position at all might be a design error.

IMO the right approach would be to replace the entire PID filter with a mechanism that just slows down the setpoint change over multiple cycles without implementing a dedicated closed control loop.
This would obviously be much more invasive than changing the PID behavior, but it should fix the underlying problem.

My approach would be to calculate the difference between the current and previous setpoint for the inverter and only partially apply this difference.
So the logic would be basically:
inverterSetpoint = ((newSetpoint - previousSetpoint) * scaleFactor) + previousSetpoint

(Fun fact: this mimics the behavior of a simple RC low-pass filter, and a scaleFactor of 1-e^-1 would be equivalent to the time constant tau being exactly the cycle time)

@sfeilmeier
Copy link
Copy Markdown
Contributor Author

@dennis-qt: Thank you for sharing your insights. I have to admit that I am not able to foresee the possible implications of such a change. I'll have to wait for further discussions and opinions here. Maybe I should have paid more attention to these classes back in University...

var error = target - input;

// We are already there
if (error == 0) {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reset errorSum cleanly when target is reached?

this.errorSum = 0;
this.lastInput = input;
this.firstRun = false;

// Post-process the output value
// 1. Apply value limits
int output = this.applyLowHighLimits(rawOutput);
final var isSaturated = rawOutput != output;
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would move this to Line 130 - 136

Anti-windup checks AFTER all output clamping (limits + direction) to prevent integral windup when direction-clamp restricts the output ?

Does this make sense to all?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I tried moving it above before, but it would behave very differently, that's why I left it like this. But worth a try probably.... once we settle on the correct overall approach.

@Sn0w3y
Copy link
Copy Markdown
Collaborator

Sn0w3y commented Feb 21, 2026

The anti-windup saturation check was performed before the direction clamp (target>0 but output<0 → 0). This allowed the integral term to accumulate unchecked when the direction clamp was the effective limiter, causing overshoot on recovery.

The fix would be imo:

  • Move anti-windup check after ALL output clamping (limits + direction)
    so isSaturated reflects the final output, not just the limit-clamped one
  • Reset errorSum to 0 when target is exactly reached (error == 0),
    preventing stale integral state from affecting subsequent cycles

Am i thinking right here?

@Sn0w3y
Copy link
Copy Markdown
Collaborator

Sn0w3y commented Feb 21, 2026

Note: I'm neither a programmer nor have I actually tested both this and #2960, so this is purely from my theoretical understanding on how the control logic works.

This doesn't seem to address my core concern I've documented in #2960 (comment) at all, where I detailed how the PID filter completely breaks the core logic behind the ESS Balancing controller and why I think having a PID in that position at all might be a design error.

IMO the right approach would be to replace the entire PID filter with a mechanism that just slows down the setpoint change over multiple cycles without implementing a dedicated closed control loop. This would obviously be much more invasive than changing the PID behavior, but it should fix the underlying problem.

My approach would be to calculate the difference between the current and previous setpoint for the inverter and only partially apply this difference. So the logic would be basically: inverterSetpoint = ((newSetpoint - previousSetpoint) * scaleFactor) + previousSetpoint

(Fun fact: this mimics the behavior of a simple RC low-pass filter, and a scaleFactor of 1-e^-1 would be equivalent to the time constant tau being exactly the cycle time)

@dennis-qt took a deeper look at the code and i think you are basically right. the core problem is you got a double
feedback loop here. the balancing controller already recalculates the ideal setpoint every single cycle based on current
measurements (calculatedPower = gridPower + essPower - targetGridSetpoint) thats already a closed loop - it measures and calculates the correction and repeats. then this value gets shoved through the PID filter where input is the current ESS power and target is the calculatedPower. if you do the math the PID error boils down to just gridPower -
targetGridSetpoint which is literally what the outer loop is already correcting for

so you end up with two nested loops basically fighting over the same thing. the I-term piles up errors that the outer loop already handles which causes overshoot and the D-term reacts to changes that are partly caused by itself from the last cycle. thats why we keep needing patches like direction clamping and anti-windup fixes - those are basically band aids for a architectural issue

the proposed low-pass approach (inverterSetpoint = previousSetpoint + scaleFactor * (newSetpoint - previousSetpoint))
would do the same job smooth ramping no hard jumps without creating a second control loop. controller stays the only
feedback loop which is way cleaner and easier to understand

the current PID does converge in practice tho and with P=0.3 the proportional term alone kinda behaves like a low-pass
already. so its not broken but every new edge case will probably need yet another patch whereas a simple ramp limiter
would just work

@sfeilmeier
Copy link
Copy Markdown
Contributor Author

My approach would be to calculate the difference between the current and previous setpoint for the inverter and only partially apply this difference.
So the logic would be basically:
inverterSetpoint = ((newSetpoint - previousSetpoint) * scaleFactor) + previousSetpoint

@dennis-qt: Ok, so you are saying we should just apply the filter (whichever - low-pass or PID filter) using the previous and current set-point and to not consider the actually measured power of the inverter. This should be ok, because anyway the inverter has it's own filter internally - and relying on the measured power again just adds delay and more fluctuation. We see this delay also in production.

We could adjust the existing PID-Filter implementation in this way. Would you still pledge for a simple low-pass filter? What would be a good choice for the scaleFactor?

@dennis-qt
Copy link
Copy Markdown

@dennis-qt: Ok, so you are saying we should just apply the filter (whichever - low-pass or PID filter) using the previous and current set-point and to not consider the actually measured power of the inverter.

My proposed solution works that way, yes, but that's not really the core statement I'm trying to make.
During my investigation into those power oscillations, I came to the following conclusion:

The system we're dealing with has multiple points that create major time delays in the control loop:

  • the delay between the inverter receiving an instruction, the inverter executing that instruction, and the inverter reporting back after the instruction was executed
  • the polling rate of the grid power meter
  • the cycle time of OpenEMS
  • in the case of some ESS systems, the delay between the ESS controller receiving an instruction and forwarding it to the actual inverter

My guess is that because the individual time delays involved are roughly on the same order of magnitude but out of sync, this seems to be the cause of system instability (like for example the grid meter having a 1Hz polling rate, the inverter updating its feedback values at 1Hz, and the OpenEMS cycle time being set to 1s).
It is pretty much impossible to perfectly sync all of those components, so there will always be some phase shift in the signal chain.

An external impulse in the form of a load change then causes this system to oscillate at what I think could be considered the resonance frequency of the system.
[Note that this happens even without the PID filter being involved at all.]

We could adjust the existing PID-Filter implementation in this way. Would you still pledge for a simple low-pass filter?

I think the goal should be to reduce the total system gain at the resonance frequency as much as possible without slowing down reaction time in all other scenarios to unacceptable levels.
A low-pass filter would be an easy solution, because I believe this resonance frequency tends to be on the upper end of the available bandwidth at typical sampling rate settings. It would also be trivial to tune simply by shifting the cut-off frequency.
If you can change the existing PID filter in a way that allows it to behave in a similar way at the resonance frequency while being easy to tune, I don't have any objections.

What would be a good choice for the scaleFactor?

The relationship between the scaleFactor and the corresponding frequency response should be easy to calculate, but the optimal value would depend heavily on

  • how much you're willing to trade off how fast the system should react against how hard you want to dampen the system resonance
  • the actual system resonance frequency
  • the cycle time setting

You could probably write a whole paper about analyzing and tuning such a system for best performance...

I'd probably start with 0.63 (roughly 1-e^-1) at first and see how high it can be pushed before the system starts to become unstable again. Community feedback would probably help with determining a value that represents a good compromise across most systems.

@sfeilmeier
Copy link
Copy Markdown
Contributor Author

@dennis-qt and @Sn0w3y: Could you please have a look at my try on implementing a Low-Pass-Filter? #3587

For a clean implementation I had to touch a few more files than technically necessary.

  • This is the main implementation: 9531c86
  • This JUnit test compares Pid to LowPass: 11411e8

@tsicking
Copy link
Copy Markdown
Contributor

Hi all, thanks for the fruitful comments. We have also been expreimenting with the PID filter in the past, in particular with the integral windup. We had a system where the ESS reported its self consumption of about 1kW as active power when the setpoint was 0, which led to a huge integral term. We did peak shaving in that setup, and most of the time the battery was doing nothing, but the error sum gained 1kW every cycle until the maximum was reached. This resulted in the battery needing about 30s until it discharged when we needed it.
What we did was the following:

  1. Write the necessary information into channels and set up a grafana dashboard. We created channels for PID Input, PID target, PID output and the individual terms outputP, outputI and outputD. This helped identifying the problem, and we still use it when we need a more detailed look into the PID filter.
  2. To prevent integral windup, we introduced an exponential decay on the error terms when calculating the I-part. In the code, it looked like this:
this.errorSum = this.applyErrorSumLimit(this.decayFactor * this.errorSum + error);

where the decay factor could be configured in the power component. It means that the n-th last error term is only considered with factor this.decayFactor^n in the error sum.

When setting the decay factor to 1 the system would behave as before. In the above system we set it to 0.99, which means that the error sum was limited to 100kW, even when the battery stood still for hours (Why? Small exercise for undergraduate students :-)). It worked well in this setup, but by now the problem with the inverter is fixed and our use case changed, so the exponential decay was not needed any more.

@sfeilmeier
Copy link
Copy Markdown
Contributor Author

@tsicking Thank you for the insights. This sounds exactly like a case where the LowPassFilter should work better, because it does not use the measured output of the inverter (which was wrongly 1 kW in your case) but only tries to smoothen the set-point instead.

What's your opinion? #3587

@tsicking
Copy link
Copy Markdown
Contributor

@tsicking Thank you for the insights. This sounds exactly like a case where the LowPassFilter should work better, because it does not use the measured output of the inverter (which was wrongly 1 kW in your case) but only tries to smoothen the set-point instead.

What's your opinion? #3587

I'll have a closer look at it, hopefully I'll manage today or tomorrow. It definitely sounds interesting.

@timo-schlegel
Copy link
Copy Markdown
Contributor

  1. Output Direction Enforcement: Ensure the output direction never contradicts the target direction:

    • If target > 0 and output < 0 → force output to 0
    • If target < 0 and output > 0 → force output to 0
    • If target = 0 → force output to 0

@sfeilmeier
I want to make sure I understood this correctly:

If I have an ESS cluster with an activePowerSetpoint of +2000 W (i.e. target direction = discharge), and one ESS in the cluster has a force charge constraint, e.g. activePower ≤ -100 W (for example from a limitTotalDischarge controller in forceCharge state),

would this mean that the resulting −100 W is corrected to 0 W due to the direction enforcement — effectively preventing the forced charging?

Is this behavior intended?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants