Skip to content

Add comprehensive unit tests for score and normalize modules#105

Open
xingzihai wants to merge 2 commits intomvanhorn:mainfrom
xingzihai:add-comprehensive-unit-tests
Open

Add comprehensive unit tests for score and normalize modules#105
xingzihai wants to merge 2 commits intomvanhorn:mainfrom
xingzihai:add-comprehensive-unit-tests

Conversation

@xingzihai
Copy link
Copy Markdown

Summary

This PR adds extensive test coverage for the score and normalize modules with 92 new unit tests.

Changes

Score Module (test_score_comprehensive.py) - 56 tests

Engagement Calculation Functions:

  • TestLog1pSafeEdgeCases - Edge cases for the log1p_safe utility function
  • TestComputeRedditEngagementRawDetailed - Detailed Reddit engagement tests including top comment boost
  • TestComputeXEngagementRawDetailed - X/Twitter engagement with likes dominating
  • TestComputeYouTubeEngagementRaw - YouTube engagement with views dominating
  • TestComputeTikTokEngagementRaw - TikTok engagement with views dominating
  • TestComputeHackerNewsEngagementRaw - HN engagement with points dominating
  • TestComputePolymarketEngagementRaw - Polymarket engagement with volume dominating
  • TestNormalizeTo100EdgeCases - Edge cases for normalize_to_100

Scoring Functions:

  • TestScoreYouTubeItems - YouTube scoring
  • TestScoreTikTokItems - TikTok scoring
  • TestScoreInstagramItems - Instagram scoring
  • TestScorePolymarketItems - Polymarket scoring with different weights
  • TestScoreWebSearchItems - WebSearch scoring with query type awareness

Sorting and Penalties:

  • TestSortItemsWithQueryType - Sort order with query-type-aware tiebreakers
  • TestDateConfidencePenalties - Date confidence affecting scores
  • TestUnknownEngagementPenalty - Unknown engagement penalties

Normalize Module (test_normalize_comprehensive.py) - 36 tests

Date Filtering:

  • TestFilterByDateRange - Comprehensive date filtering including boundary dates and None handling

Normalization Functions:

  • TestNormalizeRedditItemsComprehensive - Complete and minimal Reddit items
  • TestNormalizeXItemsComprehensive - X/Twitter items
  • TestNormalizeYouTubeItemsComprehensive - YouTube items with always-high confidence
  • TestNormalizeTikTokItemsComprehensive - TikTok items
  • TestNormalizeInstagramItemsComprehensive - Instagram items
  • TestNormalizeHackerNewsItemsComprehensive - Hacker News items
  • TestNormalizeBlueskyItemsComprehensive - Bluesky items
  • TestNormalizeTruthSocialItemsComprehensive - Truth Social items
  • TestNormalizePolymarketItemsComprehensive - Polymarket items with volume fallback

Conversion and Consistency:

  • TestItemsToDictsComprehensive - Dictionary conversion for all item types
  • TestCrossTypeNormalization - Cross-type normalization consistency

Test Coverage Highlights

  • All engagement calculation functions for each platform
  • Edge cases for null, zero, and negative values
  • Query type-based penalties and tiebreakers
  • Date confidence and recency scoring
  • All item types in normalization

Testing

All tests pass:

$ python3 -m unittest tests.test_score_comprehensive -v
Ran 56 tests in 0.005s
OK

$ python3 -m unittest tests.test_normalize_comprehensive -v
Ran 36 tests in 0.004s
OK

- Add centralized path configuration in env.py with XDG-compliant defaults
- Support environment variable overrides for all paths:
  - LAST30DAYS_CONFIG_DIR for config directory
  - LAST30DAYS_DATA_DIR for data directory
  - LAST30DAYS_DATABASE_PATH for database file
  - LAST30DAYS_CACHE_DIR for cache directory
  - LAST30DAYS_OUTPUT_DIR for output directory
  - LAST30DAYS_BRIEFS_DIR for briefs directory
- Update all modules to use centralized path functions
- Update UI to dynamically show actual config paths
- Add fallback to temp directories when permission denied
- Update README.md with path customization documentation

This allows users to customize storage locations for different
deployment environments (containers, shared systems, etc.)
This PR adds extensive test coverage for:

**Score Module (test_score_comprehensive.py):**
- All engagement calculation functions (Reddit, X, YouTube, TikTok, Instagram, HN, Bluesky, TruthSocial, Polymarket)
- Edge cases for log1p_safe utility function
- Detailed tests for normalize_to_100 with edge cases
- Scoring functions for all item types including YouTube, TikTok, Instagram, Polymarket
- WebSearch scoring with query type awareness
- Sort order with query-type-aware tiebreakers
- Date confidence penalties
- Unknown engagement penalties

**Normalize Module (test_normalize_comprehensive.py):**
- Comprehensive tests for all item types (Reddit, X, YouTube, TikTok, Instagram, HN, Bluesky, TruthSocial, Polymarket)
- Date filtering edge cases including boundary dates and None dates
- Items_to_dicts conversion for all item types
- Engagement parsing for different platforms
- Cross-type normalization consistency tests

Total: 92 new tests added
@mvanhorn
Copy link
Copy Markdown
Owner

See my reply on #109 covering all your PRs. Can't commit to anything right now with the v3.0 refactor underway, but I promise I'll consider each one once it lands. Appreciate the work.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants