feat: Two-tier cache system with bug fixes and refactoring #60
Split transaction cache into hot (recent 90 days, 6h refresh) and cold (historical, 30-day refresh) tiers to reduce unnecessary API calls while maintaining data freshness for recent transactions.

Key changes:
- Add RefreshStrategy enum (NONE, HOT_ONLY, COLD_ONLY, BOTH, ALL)
- Implement tier validation methods (is_hot_cache_valid, is_cold_cache_valid)
- Add split save/load logic for hot and cold cache files
- Integrate partial refresh in app.py (_partial_refresh method)
- MTD optimization: skip cold cache when --mtd or --since within 90 days
- Bump cache version to 3.0 (old caches auto-dropped)
- Add 50 comprehensive tests in test_tiered_cache.py

Benefits:
- Historical data refreshed every 30 days instead of 6h
- Partial refresh fetches only expired tier, loads other from cache
- --mtd loads only hot cache for faster startup
- Graceful fallback to full fetch if partial refresh fails

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
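For readers following along, here is a minimal sketch of how a refresh strategy could be chosen from the two tier ages. It is not the code in cache_manager.py: `COLD_MAX_AGE_DAYS`, `is_fresh`, and `choose_strategy` are hypothetical names, and BOTH is left out because the description does not say how it differs from ALL.

```python
from datetime import datetime, timedelta
from enum import Enum
from typing import Optional

HOT_MAX_AGE_HOURS = 6    # hot tier refresh interval from the description
COLD_MAX_AGE_DAYS = 30   # hypothetical constant name for the cold tier interval


class RefreshStrategy(Enum):
    NONE = "none"            # both tiers fresh, load everything from cache
    HOT_ONLY = "hot_only"    # only the hot tier expired
    COLD_ONLY = "cold_only"  # only the cold tier expired
    ALL = "all"              # nothing usable, do a full fetch


def is_fresh(fetched_at: Optional[datetime], max_age: timedelta, now: datetime) -> bool:
    """A tier is fresh if it was fetched at all and is no older than max_age."""
    return fetched_at is not None and now - fetched_at <= max_age


def choose_strategy(
    hot_fetched_at: Optional[datetime],
    cold_fetched_at: Optional[datetime],
    now: datetime,
) -> RefreshStrategy:
    """Pick the cheapest refresh that leaves both tiers fresh (assumed policy)."""
    hot_ok = is_fresh(hot_fetched_at, timedelta(hours=HOT_MAX_AGE_HOURS), now)
    cold_ok = is_fresh(cold_fetched_at, timedelta(days=COLD_MAX_AGE_DAYS), now)
    if hot_ok and cold_ok:
        return RefreshStrategy.NONE
    if cold_ok:
        return RefreshStrategy.HOT_ONLY
    if hot_ok:
        return RefreshStrategy.COLD_ONLY
    return RefreshStrategy.ALL
```

Passing `now` explicitly keeps the function pure, which makes expiry edge cases easy to test.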
- Add _load_merchant_cache() for merchant cache loading with error handling
- Add _merge_hot_cold_dfs() for deduplication merge logic
- Simplify _check_and_load_cache() using helper methods
- Unify HOT_ONLY/COLD_ONLY branches in _partial_refresh()

Reduces app.py by ~85 lines while maintaining identical functionality.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
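The deduplicating merge can be illustrated with plain dictionaries instead of DataFrames. This is only a sketch under stated assumptions: that rows are keyed by an `id` field, that the hot copy wins when a transaction appears in both tiers, and that dates are ISO strings. `merge_hot_cold` is a hypothetical stand-in, not the `_merge_hot_cold_dfs()` helper itself.

```python
from typing import Any, Dict, List

Transaction = Dict[str, Any]


def merge_hot_cold(hot: List[Transaction], cold: List[Transaction]) -> List[Transaction]:
    """Merge both tiers, keeping one row per id (hot wins on conflict)."""
    merged: Dict[Any, Transaction] = {}
    for txn in cold:
        merged[txn["id"]] = txn
    for txn in hot:
        merged[txn["id"]] = txn  # assumed: the hot tier is fresher, so it overwrites cold
    # ISO-formatted date strings sort chronologically as plain strings.
    return sorted(merged.values(), key=lambda t: t["date"])
```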
Pull request overview
This PR implements a sophisticated two-tier cache system that optimizes data fetching by splitting transactions into hot (recent 90 days) and cold (historical) tiers with different refresh intervals. The hot cache refreshes every 6 hours while the cold cache refreshes every 30 days, reducing unnecessary API calls for historical data.
Key changes:
- Introduced `RefreshStrategy` enum with five strategies (NONE, HOT_ONLY, COLD_ONLY, BOTH, ALL) for intelligent cache refresh decisions
- Implemented separate hot/cold cache files with split save/load logic and merge functionality with deduplication
- Added MTD optimization that skips cold cache loading when queries are within the 90-day hot window
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 9 comments.
| File | Description |
|---|---|
| tests/test_tiered_cache.py | Comprehensive test suite with 50 tests covering tier splitting, validation, refresh strategies, merge logic, and data integrity |
| tests/test_cache_manager.py | Updated existing tests for backwards compatibility with two-tier cache structure |
| tests/test_cache.py | Updated integration tests to work with new hot/cold file paths and metadata structure |
| moneyflow/cache_manager.py | Core implementation of two-tier cache with RefreshStrategy enum, tier validation methods, split save/load operations, and merge logic |
| moneyflow/app.py | Integration of partial refresh logic, hot-only optimization for MTD queries, and strategy-based cache loading |
Address Copilot review comments on PR #60:
- Fix module docstring: "24 hours" → "6 hours" to match HOT_MAX_AGE_HOURS
- Remove unused List import from typing
- Rename test methods to reflect actual 6-hour expiry policy
- Update test values from 25 hours to 7 hours for more precise testing

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Three critical fixes for the two-tier cache system:

1. Separate display filters from cache behavior
   - --mtd and --year now only filter the VIEW, not what's cached
   - Cache always stores full data (year=None, since=None)
   - Prevents --mtd from nuking an existing full cache
2. Fix partial refresh API calls
   - Monarch API requires BOTH startDate and endDate
   - Was passing None for one date, causing API failure
   - Added get_hot_refresh_date_range() and get_cold_refresh_date_range()
3. Prevent gaps between cache tiers
   - Hot refresh now uses cold's latest_date (from metadata)
   - Cold refresh now uses hot's earliest_date (from metadata)
   - Both use 7-day overlap (TIER_OVERLAP_DAYS)
   - Fixes gap that would grow daily as boundary moves

Added 12 regression tests covering:
- Display filters don't invalidate cache
- Partial refresh date ranges always non-None
- Tier overlap ensures no gaps

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
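Fixes 2 and 3 can be pictured as two small functions that always return a complete (start, end) pair and deliberately reach across the tier boundary. The sketch below is a guess at the shape, not the real `get_hot_refresh_date_range()` / `get_cold_refresh_date_range()`: the function names, the `history_start` parameter, and the direction in which the overlap is applied are all assumptions.

```python
from datetime import date, timedelta
from typing import Tuple

TIER_OVERLAP_DAYS = 7  # overlap named in the commit message


def hot_refresh_range(today: date, cold_latest: date) -> Tuple[date, date]:
    """Start a few days before cold's latest_date so the tiers cannot drift apart."""
    start = cold_latest - timedelta(days=TIER_OVERLAP_DAYS)
    return start, today  # both ends are always set, never None


def cold_refresh_range(history_start: date, hot_earliest: date) -> Tuple[date, date]:
    """End a few days after hot's earliest_date for the same reason."""
    end = hot_earliest + timedelta(days=TIER_OVERLAP_DAYS)
    return history_start, end
```

Anchoring each range on the other tier's metadata, rather than on "today minus 90 days", is what keeps a gap from growing as the boundary moves.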
When running with --mtd or --year, the app loads only filtered (recent) transactions. Previously, committing edits would call save_cache() with this filtered data, causing the cold cache to be overwritten with empty data since all transactions were within the hot window.

Now handle_commit_result() detects filtered view mode and uses save_hot_cache() instead, which preserves the cold tier data.

- Add is_filtered_view parameter to handle_commit_result()
- Use save_hot_cache() when operating on filtered data
- Add logging to cache save operations for debugging
- Add 3 regression tests to prevent this bug from recurring

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
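A stripped-down illustration of the fix: the commit handler picks the save method based on whether the view is filtered. `RecordingCache` is a hypothetical test double, and the real `handle_commit_result()` almost certainly takes more arguments than the three assumed here.

```python
from typing import Any, List


class RecordingCache:
    """Hypothetical stand-in that records which save method was called."""

    def __init__(self) -> None:
        self.calls: List[str] = []

    def save_cache(self, transactions: List[Any]) -> None:
        self.calls.append("save_cache")  # would rewrite both tiers

    def save_hot_cache(self, transactions: List[Any]) -> None:
        self.calls.append("save_hot_cache")  # would leave the cold tier untouched


def handle_commit_result(cache: RecordingCache, transactions: List[Any], is_filtered_view: bool) -> None:
    """Persist edits without letting a filtered view overwrite the cold tier."""
    if is_filtered_view:
        cache.save_hot_cache(transactions)
    else:
        cache.save_cache(transactions)
```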
- Remove RefreshStrategy.BOTH (use ALL when both tiers stale)
- Remove dead _filter_covered() method and year/since parameters
- Simplify get_refresh_strategy() and is_cache_valid() signatures
- Rename _merge_dataframes to public merge_tiers()
- Add logging to cache save operations

Test consolidation:
- Merge test_tiered_cache.py and test_cache_manager.py into test_cache.py
- Remove ~1700 lines of duplicate test code
- Add edge case tests (unicode, large data, corrupt files)
- Add display filtering tests

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Show explicit date ranges in status messages during refresh
- Partial refresh: "Refreshing recent transactions (2024-09-23 to 2024-12-22)"
- Full refresh: "Full refresh: fetching 2025-01-01 to 2024-12-22"
- Clearer notifications showing what was fetched vs cached
- Remove unused boundary_str variable

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Progress messages now show the date range being fetched, making it clear whether it's a full refresh or a partial cache update. Messages like "Fetching all transactions..." or "Downloading 1,069 transactions (2024-09-15 to 2024-12-22)..." provide better visibility into what the app is doing.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Two fixes for cache continuity:

1. Cold cache now includes 30 days of overlap into the hot window. This prevents gaps when cold expires: after 30 days the boundary moves forward, but cold data still reaches the new boundary.
2. --mtd --refresh (or similar hot-only views) now only refreshes the hot tier, not all historical data. The override logic is kept in app.py where view context exists, not in the cache layer.

Added test_cold_cache_has_30_day_overlap to verify the overlap guarantee.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
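The arithmetic behind the 30-day overlap is easy to check by hand. In the sketch below, `COLD_OVERLAP_DAYS`, `tier_cutoffs`, and `split_tiers` are hypothetical names and the rows are plain dicts. The point is only that the cold cutoff today equals the hot/cold boundary 30 days from now, so a cold tier saved today still reaches the boundary when it expires.

```python
from datetime import date, timedelta
from typing import Any, Dict, List, Tuple

HOT_WINDOW_DAYS = 90     # hot tier covers the most recent 90 days
COLD_OVERLAP_DAYS = 30   # hypothetical name for the overlap into the hot window

Transaction = Dict[str, Any]


def tier_cutoffs(today: date) -> Tuple[date, date]:
    """Return (hot/cold boundary, exclusive upper bound of what cold stores)."""
    boundary = today - timedelta(days=HOT_WINDOW_DAYS)
    cold_cutoff = boundary + timedelta(days=COLD_OVERLAP_DAYS)
    return boundary, cold_cutoff


def split_tiers(transactions: List[Transaction], today: date) -> Tuple[List[Transaction], List[Transaction]]:
    """Split rows into hot and cold; rows inside the overlap land in both tiers."""
    boundary, cold_cutoff = tier_cutoffs(today)
    hot = [t for t in transactions if t["date"] >= boundary]
    cold = [t for t in transactions if t["date"] < cold_cutoff]
    return hot, cold
```

Rows in the overlap are stored twice, which is why the merge step has to deduplicate.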
…hants

When the merchant cache is empty, pl.Series creates a Series with dtype null, which can't be concatenated with str Series. Fixed by explicitly setting dtype=pl.Utf8 when creating the cached_series.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Adds _is_cache_structure_valid() that runs before using cached data. If any check fails, forces a full refresh to prevent serving stale or inconsistent data after code changes to cache logic.

Checks performed:
1. Required metadata fields exist for both tiers
2. Cold cache extends to boundary (within 7-day tolerance)
3. No gap between cold latest_date and hot earliest_date

This is a defensive measure that will auto-heal cache issues from future changes to the cache handling logic.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
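The three checks could look roughly like this. It is a sketch, not the method in the PR: the metadata is assumed to be a dict with `earliest_date` and `latest_date` keys holding `date` objects, `REQUIRED_FIELDS` is a hypothetical name, and the tolerance is applied only to check 2 because that is the only place the commit message mentions it.

```python
from datetime import date, timedelta
from typing import Any, Dict

GAP_TOLERANCE_DAYS = 7
REQUIRED_FIELDS = ("earliest_date", "latest_date")  # assumed metadata keys


def is_cache_structure_valid(hot_meta: Dict[str, Any], cold_meta: Dict[str, Any], boundary: date) -> bool:
    """Return False whenever the cached tiers cannot be trusted to be contiguous."""
    # 1. Required metadata fields exist for both tiers.
    for meta in (hot_meta, cold_meta):
        if any(meta.get(field) is None for field in REQUIRED_FIELDS):
            return False
    # 2. Cold cache extends to the boundary, within the tolerance.
    if cold_meta["latest_date"] < boundary - timedelta(days=GAP_TOLERANCE_DAYS):
        return False
    # 3. No gap: hot must start no later than the day after cold ends.
    if hot_meta["earliest_date"] > cold_meta["latest_date"] + timedelta(days=1):
        return False
    return True
```

A caller would treat `False` as "ignore the cache and do a full refresh".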
The merchant search was using str.contains() which interprets the input as regex by default. Characters like * ? ( ) would cause regex parse errors. Fixed by adding literal=True to treat the search pattern as a plain string.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Extracted two pure functions from EditMerchantScreen:
- filter_merchants(): Filters merchant Series by query with proper regex escaping (literal=True)
- parse_merchant_option_id(): Parses __new__: prefix to distinguish new vs existing merchants

Added comprehensive unit tests (13 tests) including:
- Case-insensitive matching
- Partial string matching
- Deduplication and sorting
- Limit parameter
- Regex special character handling (* ? ( ) + [ ])
- Option ID parsing for new vs existing merchants

The regex test would have caught the bug fixed in the previous commit.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
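To show the behavior those tests pin down, here is a standard-library sketch that uses plain substring matching on a list of strings instead of a Polars Series. The signatures and the `(name, is_new)` return shape are assumptions; only the function names and the `__new__:` prefix come from the commit message.

```python
from typing import List, Optional, Tuple

NEW_PREFIX = "__new__:"


def filter_merchants(merchants: List[str], query: str, limit: Optional[int] = None) -> List[str]:
    """Case-insensitive literal substring match, deduplicated and sorted."""
    needle = query.lower()
    # `in` compares literally, so characters like ( ) * ? need no escaping.
    matches = sorted({m for m in merchants if needle in m.lower()})
    return matches if limit is None else matches[:limit]


def parse_merchant_option_id(option_id: str) -> Tuple[str, bool]:
    """Return (merchant name, is_new) based on the __new__: prefix."""
    if option_id.startswith(NEW_PREFIX):
        return option_id[len(NEW_PREFIX):], True
    return option_id, False
```

Because both functions are pure, they can be tested without constructing a screen.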
When in detail view with sub-grouping enabled, the current_data contains aggregate fields (merchant, count, total) instead of transaction fields (id). This caused a KeyError when trying to access row_data["id"].

Added sub_grouping_mode check to both action_delete_transaction and action_show_transaction_details guards, matching the pattern already used in the hide/unhide toggle code.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
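The guard amounts to refusing to read `id` from an aggregate row. A minimal sketch, assuming rows are dicts and using a hypothetical helper name (`selected_transaction_id`) in place of the guards inside the two action methods:

```python
from typing import Any, Dict, Optional


def selected_transaction_id(row_data: Dict[str, Any], sub_grouping_mode: bool) -> Optional[str]:
    """Return the transaction id for the row, or None when the row is an aggregate."""
    if sub_grouping_mode:
        # Aggregate rows carry merchant/count/total and have no "id" key.
        return None
    return row_data.get("id")
```

An action would simply return early when this yields `None`.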
This PR has been quite a lot of bug whack-a-mole. I'm going to use this branch locally for a while until I stop seeing serious bugs before merging.
Pull request overview
Copilot reviewed 15 out of 15 changed files in this pull request and generated 6 comments.
- Move TIER_OVERLAP_DAYS to class constants section
- Add GAP_TOLERANCE_DAYS constant (was hardcoded as 7)
- Add comments to empty except clauses explaining fallback behavior

Addresses code review feedback from Copilot.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Summary
This PR implements a two-tier cache system for optimized data fetching, along with several bug fixes and refactoring improvements discovered during development and testing.
Two-Tier Cache System
Split transaction cache into hot (recent 90 days, 6h refresh) and cold (historical, 30-day refresh) tiers to reduce unnecessary API calls while maintaining data freshness for recent transactions.
Key changes:
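- Add `RefreshStrategy` enum (NONE, HOT_ONLY, COLD_ONLY, BOTH, ALL)
- Implement tier validation methods (`is_hot_cache_valid`, `is_cold_cache_valid`)
- Integrate partial refresh in app.py (`_partial_refresh` method)
- MTD optimization: skip cold cache when `--mtd` or `--since` within 90 days

Benefits:
- `--mtd` loads only hot cache for faster startup

Cache System Hardening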
New Module: Cache Orchestrator
Extract cache orchestration logic into `cache_orchestrator.py` for better separation of concerns and testability. This module handles the coordination between cache manager, data fetching, and refresh strategies.

Bug Fixes (Non-Cache Related)
- […] `()` or `.` now search correctly

Refactoring
- […] `edit_screens.py` into pure functions

Test Coverage
- […] `test_cache.py`
- […] `test_cache_orchestrator.py` with 224 lines of orchestrator tests
- […] `test_edit_screens.py` with 130 lines of edit screen logic tests
- […] `test_app_controller.py` tests for sub-grouped view behavior

Test Plan
- […] (`pyright moneyflow/`)
- […] `--mtd` and `--refresh` flags

Files Changed
- `moneyflow/cache_manager.py` - Core two-tier cache implementation
- `moneyflow/cache_orchestrator.py` - New orchestration module
- `moneyflow/app.py` - Integration and UI updates
- `moneyflow/app_controller.py` - Sub-grouped view fixes
- `moneyflow/screens/edit_screens.py` - Extracted business logic
- `moneyflow/data_manager.py` - Minor cache integration updates
- `tests/test_cache.py` - Consolidated cache tests
- `tests/test_cache_orchestrator.py` - New orchestrator tests
- `tests/test_edit_screens.py` - New edit screen tests
- `tests/test_app_controller.py` - New controller tests

🤖 Generated with Claude Code