Description
Describe the bug
Currently, activity monitoring greatly overestimates user watch time by summing the duration of each media file (duration_ms), rather than calculating the actual amount of media a user has watched.
Where this happens
- In `app/activity/monitoring/collectors/jellyfin.py` (and other collectors): `duration_ms` is set to the total duration of the media file, not the actual time watched.
- In `app/services/activity/analytics.py`: SQL expressions (like `watch_time_expr`) use `ActivitySession.duration_ms` to compute total/aggregate watch times for users, content, and dashboards.
Example problem
If a user opens a 3h20m movie and watches only 10 minutes (or just scrubs), the system will add 3h20m to their watch time. This results in highly inflated watch time stats, user rankings, and dashboards.
Root Cause
The duration_ms field represents the TOTAL FILE DURATION instead of ACTUAL WATCHED TIME, but all analytics code treats it as if it's real watch time. There is currently no reliable field tracking actual time watched per session.
How to Fix
- Add a new field (e.g. `watched_ms`, `actual_duration_ms`) per `ActivitySession` for actual watch time.
- During session processing (end events, grouping), calculate `watched_ms` from user progress:
  - Use `position_ms` (end position),
  - Or, if not available, use `session_end - session_start`
  - For re-opened/resumed sessions, sum up multiple positions/time deltas (handle re-groups appropriately)
- Update analytics in `app/services/activity/analytics.py` to sum this new field instead of `duration_ms`.
- Update all collectors (Jellyfin, Plex, Emby, ABS, etc.) to emit and track actual watch time.
- Consider potential edge cases: paused, scrubbing, unfinished, or abandoned sessions.
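One possible way to derive the new field is sketched below. It is not Wizarr's implementation: `ProgressSample`, `compute_watched_ms`, and `max_gap_ms` are hypothetical names, and the sketch assumes each poll of a session yields a timestamp, a state, and a position. It credits wall-clock time only for intervals that began in the `playing` state, so paused intervals and scrubbing (position jumps) do not inflate the total, and it caps each interval so a missed poll or an abandoned session cannot add an unbounded amount.

```python
from dataclasses import dataclass


@dataclass
class ProgressSample:
    """One polled observation of a playback session (hypothetical shape)."""

    timestamp_ms: int  # wall-clock time of the poll
    state: str         # "playing" or "paused"
    position_ms: int   # playback position reported by the media server


def compute_watched_ms(samples: list[ProgressSample], max_gap_ms: int = 60_000) -> int:
    """Sum wall-clock time spent playing between consecutive polls.

    position_ms is deliberately ignored here, so seeking forward or
    backward never changes the result.
    """
    watched = 0
    for prev, curr in zip(samples, samples[1:]):
        if prev.state != "playing":
            continue  # interval started paused: nothing was watched
        delta = curr.timestamp_ms - prev.timestamp_ms
        if delta <= 0:
            continue  # out-of-order or duplicate sample
        # Cap long gaps (missed polls, abandoned sessions) at max_gap_ms.
        watched += min(delta, max_gap_ms)
    return watched


# A 3h20m movie polled every 10 seconds for 10 minutes, then scrubbed to the end.
samples = [ProgressSample(i * 10_000, "playing", i * 10_000) for i in range(61)]
samples.append(ProgressSample(610_000, "playing", 12_000_000))
watched_ms = compute_watched_ms(samples)
```

With these samples the session is credited with 610,000 ms (about 10 minutes) rather than the 12,000,000 ms file duration. A variant could use position deltas instead of wall-clock deltas, at the cost of extra handling for seeks.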
Links to Code (examples)
Jellyfin collector:
wizarr/app/activity/monitoring/collectors/jellyfin.py
Lines 1 to 116 in 5fb4a70
```python
"""
Jellyfin activity collector using Sessions API polling.
Polls Jellyfin's Sessions API to monitor active playback sessions.
"""

from datetime import UTC, datetime
from typing import Any

from ...domain.models import ActivityEvent
from ..monitor import BaseCollector


class JellyfinCollector(BaseCollector):
    """Jellyfin activity collector using Sessions API polling."""

    def __init__(self, server, event_callback):
        super().__init__(server, event_callback)
        self.active_sessions: dict[str, dict[str, Any]] = {}

    def _collect_loop(self):
        """Main collection loop using Jellyfin Sessions API polling."""
        self.logger.info("Starting Jellyfin Sessions API polling")

        while self.running and not self._stop_event.is_set():
            try:
                client = self._get_media_client()
                if client:
                    self.logger.debug("Polling Jellyfin Sessions API...")
                    sessions = client.now_playing()
                    if sessions:
                        self.logger.info(f"Found {len(sessions)} active sessions")
                        for i, session in enumerate(sessions):
                            self.logger.debug(
                                f"Session {i + 1}: {session.get('user_name', 'Unknown')} - {session.get('media_title', 'Unknown')}"
                            )
                    else:
                        self.logger.debug("No active sessions found")
                    self._process_sessions(sessions)
                else:
                    self.logger.warning("Failed to get media client for polling")

                # Poll every 10 seconds for responsive monitoring
                self._stop_event.wait(10)
            except Exception as e:
                self.logger.error(f"Jellyfin API polling error: {e}", exc_info=True)
                self.error_count += 1
                self._stop_event.wait(30)  # Wait longer on error

    def _process_sessions(self, sessions):
        """Process sessions from Jellyfin API and emit events."""
        if not sessions:
            # All sessions ended
            for session_id in list(self.active_sessions.keys()):
                old_session = self.active_sessions.pop(session_id)
                self._emit_session_event(old_session, "session_end")
            return

        current_session_ids = set()
        for session_data in sessions:
            try:
                session_id = session_data.get("session_id", "")
                if not session_id:
                    continue
                current_session_ids.add(session_id)

                # Check if this is a new session or has changes
                if session_id not in self.active_sessions:
                    # New session
                    self.active_sessions[session_id] = session_data
                    self._emit_session_event(session_data, "session_start")
                else:
                    # Check for state changes
                    old_session = self.active_sessions[session_id]
                    self.active_sessions[session_id] = session_data
                    old_state = old_session.get("state", "playing")
                    new_state = session_data.get("state", "playing")
                    if old_state != new_state:
                        if new_state == "paused":
                            self._emit_session_event(session_data, "session_pause")
                        else:
                            self._emit_session_event(session_data, "session_resume")
                    else:
                        # Regular progress update
                        self._emit_session_event(session_data, "session_progress")
            except Exception as e:
                self.logger.error(f"Failed to process session: {e}", exc_info=True)

        # Remove ended sessions
        ended_sessions = set(self.active_sessions.keys()) - current_session_ids
        for session_id in ended_sessions:
            old_session = self.active_sessions.pop(session_id)
            self._emit_session_event(old_session, "session_end")

    def _emit_session_event(self, session_data: dict[str, Any], event_type: str):
        """Convert session data to ActivityEvent and emit."""
        try:
            event = ActivityEvent(
                event_type=event_type,
                server_id=self.server.id,
                session_id=session_data.get("session_id", ""),
                user_name=session_data.get("user_name", "Unknown"),
                media_title=session_data.get("media_title", "Unknown"),
                timestamp=datetime.now(UTC),
                user_id=session_data.get("user_id"),
                media_type=session_data.get("media_type"),
                media_id=session_data.get("media_id"),
                series_name=session_data.get("series_name"),
                # ... (excerpt truncated)
```
Analytics bug:
wizarr/app/services/activity/analytics.py
Lines 134 to 145 in 5fb4a70
```python
        if db is None:
            return self._get_empty_dashboard_stats()

        try:
            from sqlalchemy import and_, case, extract, func, or_

            from app.models import MediaServer

            filters = []
            start_date = None
            if days != 0:
                start_date = datetime.now(UTC) - timedelta(days=days)
            # ... (excerpt truncated)
```
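To illustrate the analytics change, here is a rough standalone sketch using the standard-library `sqlite3` module rather than the project's SQLAlchemy models. The table `activity_session`, the column `watched_ms`, and the function `watch_time_by_user` are assumptions made for the example; only `duration_ms` comes from the existing code. It compares the current aggregate (summing file durations) with an aggregate over the proposed field, treating legacy rows without a recorded value as zero.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE activity_session (
        user_name   TEXT,
        duration_ms INTEGER,  -- total length of the media file
        watched_ms  INTEGER   -- hypothetical new column: time actually watched
    )
    """
)
conn.executemany(
    "INSERT INTO activity_session VALUES (?, ?, ?)",
    [
        ("alice", 12_000_000, 600_000),   # 3h20m movie, 10 minutes watched
        ("alice", 1_500_000, 1_500_000),  # 25-minute episode watched in full
        ("bob", 12_000_000, None),        # legacy row with no watched_ms recorded
    ],
)


def watch_time_by_user(conn: sqlite3.Connection) -> dict[str, tuple[int, int]]:
    """Return {user: (summed file duration, summed watched time)} in ms."""
    rows = conn.execute(
        """
        SELECT user_name,
               SUM(duration_ms)             AS file_ms,
               SUM(COALESCE(watched_ms, 0)) AS watched_ms
        FROM activity_session
        GROUP BY user_name
        ORDER BY user_name
        """
    ).fetchall()
    return {name: (file_ms, watched) for name, file_ms, watched in rows}


totals = watch_time_by_user(conn)
```

Here `totals["alice"]` is `(13_500_000, 2_100_000)`: 3h45m under the current calculation against 35 minutes actually watched. Counting legacy rows as zero is one design choice among several; falling back to `duration_ms` for them would keep the over-counting this issue describes.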
Additional context
- The dashboard/leaderboards are currently unusable for real user insights because watch times are vastly over-counted.
- Other collectors may have similar logic and need review.
Steps provided are based on code as of Jan 2026.