Fix stale grace state after plan upgrades and correct threshold semantics#21
Fix stale grace state after plan upgrades and correct threshold semantics#21
Conversation
…tics This PR fixes several related bugs in the grace period and enforcement logic that were causing incorrect UI states after plan changes and at limit boundaries. ## Problem 1: Stale grace warnings after plan upgrade When a user exceeded their plan limit (e.g., 111/100 licenses), a grace period would start and an `exceeded_at` timestamp would be saved. If the user then upgraded to a higher plan (e.g., 500 licenses), the grace warning would persist even though 111/500 is well under the new limit. **Root cause**: The `exceeded_at` flag was never cleared when usage dropped below the limit. The system only checked if `exceeded_at` existed, not whether the user was *currently* exceeded. ## Problem 2: Grace triggering at exact limit (not over) For `grace_then_block` policies, grace periods were starting when usage reached the exact limit (e.g., 100/100) instead of when it went over (101/100). This meant users couldn't fully use their allocation before grace began. **Root cause**: The threshold check used `>=` uniformly for all policies, but `grace_then_block` semantics require `>` (strictly over). ## Problem 3: Missing grace when usage increases via non-callback paths If usage increased through status changes, bulk imports, or manual DB updates (bypassing ActiveRecord callbacks), no grace state would be created even when over limit. The dashboard would show the user as over limit but without the grace period UI. **Root cause**: Grace state was only created via the `after_create` callback. There was no mechanism to lazily create grace when checking status. ## Solution ### 1. Self-healing state (clear stale flags) Added `clear_exceeded_flags!` helper that clears `exceeded_at` and `blocked_at` when usage drops below the limit. This is called as a side effect of: - `grace_active?` - if state exists but usage is now under limit - `should_block?` - if state exists but usage is now under limit This "self-healing" approach means stale state is automatically cleaned up whenever the system checks the current status, without requiring explicit cleanup after every plan change. ### 2. Semantic threshold logic (`exceeded_now?`) Created a shared `exceeded_now?` helper with policy-aware logic: - `:grace_then_block` → uses `>` (over limit triggers grace) - `:block_usage` → uses `>=` (at limit blocks next creation) - `:just_warn` → uses `>=` (at limit shows warning) This is now consistent across `GraceManager`, `StatusContext`, and `Callbacks`. ### 3. Lazy grace creation in StatusContext Added `should_lazily_start_grace?` check to `grace_active?`. If usage is over limit for a `grace_then_block` policy but no enforcement state exists, grace is started on-demand. This handles edge cases where callbacks were bypassed. ### 4. DRY with ExceededStateUtils module Extracted shared logic into a new `ExceededStateUtils` module included by both `GraceManager` (class methods) and `StatusContext` (instance methods). This ensures consistent behavior and makes the codebase more maintainable. ## Files Changed - `lib/pricing_plans/exceeded_state_utils.rb` - NEW: shared utilities - `lib/pricing_plans/grace_manager.rb` - self-healing + threshold fix - `lib/pricing_plans/status_context.rb` - self-healing + lazy grace - `lib/pricing_plans/callbacks.rb` - threshold fix for grace trigger - `README.md` - documented self-healing and semantic enforcement - Tests updated to match new semantics with 7 new test cases ## Testing - All 538 tests pass - Stress tested in production-like environment with ~5000 licenses - Verified plan upgrade/downgrade correctly clears/triggers grace - Verified at-limit vs over-limit distinction works correctly Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Code ReviewThis is a well-motivated PR that fixes real user-facing bugs. The self-healing state approach is more robust than callback-based cleanup, and the Potential Bugs1. Command-Query Separation violation in Both methods now perform DB writes (clearing flags) as a side effect of a read. This is the biggest architectural concern. Callers — especially views that call The self-healing is valuable, but consider separating concerns:
Alternatively, if the write-on-read is intentional, it should be clearly documented on the method signature and the performance implication acknowledged. 2. Race condition in lazy grace creation ( unless state&.exceeded?
if should_lazily_start_grace?(limit_key) # check
GraceManager.mark_exceeded!(...) # actThe 3. Extra DB query for non-exceeded users in unless exceeded
if (state = fresh_state_or_nil(plan_owner, limit_key))
clear_exceeded_flags!(state)
end
return false
endThe 4. Reconciliation script in PR description is policy-unaware The optional cleanup script clears state when # As written - policy-unaware:
if current_usage <= limit_amount.to_i
# Should be:
unless exceeded_now?(current_usage, limit_amount, after_limit: limit_config[:after_limit])Design Observations5.
6. Both methods resolve config, check def should_lazily_start_grace?(limit_key)
limit_config = limit_config_for(limit_key)
return false unless limit_config&.dig(:after_limit) == :grace_then_block
currently_exceeded?(limit_key)
end7. Changed from Minor
Summary
|
|
Thanks for the thorough review! A few responses: 1. Command-Query Separation / Write-on-read: Intentional design decision. The writes only happen when state is stale (rare), and 2. Race condition in lazy grace: The 3. Extra DB query in 4. Reconciliation script policy-awareness: Good catch! Updated the PR description with a policy-aware version that uses the 5-7: Minor suggestions that don't justify the added complexity. The current implementation is clear and well-tested. Ready to merge! |
Summary
This PR fixes several related bugs in the grace period and enforcement logic that were causing incorrect UI states after plan changes and at limit boundaries.
exceeded_at/blocked_atflags are automatically cleared when usage drops below the limitgrace_then_blocknow triggers at>(over limit), not>=(at limit)ExceededStateUtilsmoduleProblems & Motivation
Problem 1: Stale grace warnings persisted after plan upgrades
Scenario: User on Hobby plan (100 license limit) creates 111 licenses, triggering a grace period. User then upgrades to Business plan (500 license limit). Despite being at 111/500, the grace warning banner continued showing.
Impact: Users saw alarming "grace period ending" warnings even after upgrading and being well under their new limits. This created confusion and undermined trust in the upgrade path.
Root cause: The system saved
exceeded_atwhen grace started but never cleared it when usage dropped below the limit. Checks only verified the flag existed, not whether the user was currently exceeded.Problem 2: Grace started at exact limit instead of over limit
Scenario: User on a plan with 5 project limit creates their 5th project (5/5). Grace period would incorrectly start at this point.
Impact: Users couldn't fully use their allocation. Grace warnings appeared prematurely, making limits feel more restrictive than intended.
Expected behavior: For
grace_then_block, grace should start when going over (6/5), not when reaching (5/5). The distinction matters:Problem 3: Missing grace when usage increased via non-callback paths
Scenario: Admin bulk-imports licenses via SQL, or license status changes from
suspendedtoactive(incrementing active count). Usage goes over limit but no grace period starts.Impact: Dashboard would show "111/100 licenses" with blocked severity but no grace period UI, because the
after_createcallback never fired.Solution:
StatusContext.grace_active?now lazily creates grace state when it detects over-limit usage forgrace_then_blockpolicies:Architecture: Self-Healing State
Rather than requiring explicit cleanup after every plan change, the system now "self-heals" by checking current usage whenever status is queried:
This approach is more robust than callback-based cleanup because:
New Module: ExceededStateUtils
Extracted shared logic to ensure consistent behavior:
Included by both
GraceManager(class methods) andStatusContext(instance methods).Test Plan
test_on_grace_start_does_not_fire_at_exact_limittest_grace_active_clears_state_when_usage_is_below_limittest_should_block_clears_stale_block_flags_when_usage_is_below_limittest_grace_active_returns_false_when_state_exists_but_usage_is_below_limittest_grace_active_returns_false_at_exact_limit_for_grace_then_blockMigration Notes
No database migrations required. The fix is purely in application logic.
Stale state cleanup: Existing stale enforcement states will be automatically cleaned up the first time they're checked after this deploy. You can optionally run a reconciliation script to proactively clear stale states:
🤖 Generated with Claude Code