Skip to content

Conversation

@Jay-ju
Copy link
Contributor

@Jay-ju Jay-ju commented Jan 7, 2026

Changes Made

image

Added statistics on the number of input and output lines for each operator on the flotilla, The style is similar to swordfish.

Related Issues

@Jay-ju Jay-ju marked this pull request as draft January 7, 2026 13:35
@github-actions github-actions bot added the feat label Jan 7, 2026
@greptile-apps
Copy link
Contributor

greptile-apps bot commented Jan 7, 2026

Greptile Summary

This PR enhances the progress bar display across both Ray (Flotilla) and Native (Swordfish) runners with detailed metrics tracking. The changes add "rows in" and "rows out" statistics to progress bars, along with improved formatting using ship emoji (🚢) for Ray runner and elapsed time display.

Key changes:

  • Enhanced Python ProgressBar class with ship icon (🚢), checkmark (✓) when finished, and formatted elapsed time display
  • Added HashMap tracking in Rust FlotillaProgressBar to accumulate rows_in and rows_out metrics across task completions
  • Introduced should_record_rows_in() trait method to selectively track input rows for sources like InMemorySource
  • Changed CountingSender.rt visibility to pub(crate) to allow source nodes to record rows_in metrics
  • Removed multi_progress.clear() call to preserve final progress bar state
  • Added comprehensive tests validating progress bar formatting for both runners

The implementation correctly uses mutable borrows in Rust (&mut self in handle_event) for thread-safe HashMap updates.

Confidence Score: 4/5

  • This PR is safe to merge with minor considerations around test reliability in non-TTY environments
  • The implementation is sound with proper thread safety (mutable borrows in Rust), clear separation of concerns, and appropriate visibility changes. The logic correctly tracks metrics and updates progress bars. Score is 4 (not 5) due to: (1) tests may have limited reliability in CI environments without TTY support, as acknowledged in test comments; (2) manual elapsed time calculation bypasses tqdm's built-in elapsed tracking for Ray runner, which could drift if progress bar updates are delayed
  • tests/test_progress_bar.py - tests may not run reliably in non-TTY CI environments

Important Files Changed

Filename Overview
daft/runners/progress_bar.py Enhanced progress bar formatting with ship icon, elapsed time tracking, and rows in/out metrics display
src/daft-distributed/src/python/progress_bar.rs Added metrics tracking (rows in/out) to Flotilla progress bar with proper state management
src/daft-local-execution/src/sources/source.rs Added rows_in tracking to source stats, implemented conditional recording based on source type
tests/test_progress_bar.py Added tests for native and Ray progress bar formatting with metrics validation

Sequence Diagram

sequenceDiagram
    participant Task as Task Execution
    participant Source as SourceNode
    participant CountingSender as CountingSender
    participant Stats as RuntimeStats
    participant Sub as StatisticsSubscriber
    participant PyPB as Python ProgressBar
    
    Note over Task,PyPB: Task Submission & Bar Creation
    Task->>Sub: TaskEvent::Submitted
    Sub->>PyPB: make_bar_or_update_total(bar_id, bar_name)
    PyPB->>PyPB: Create bar with ship icon 🚢[idx]
    
    Note over Task,PyPB: Data Processing & Metrics Collection
    Source->>Source: get_data() returns partitions
    loop For each partition
        Source->>Stats: add_rows_in(part.len())
        Source->>CountingSender: send(partition)
        CountingSender->>Stats: add_rows_out(part.len())
    end
    
    Note over Task,PyPB: Task Completion & Progress Update
    Task->>Sub: TaskEvent::Completed with stats
    Sub->>Sub: Extract rows_in and rows_out from stats
    Sub->>Sub: Accumulate totals in HashMap
    Sub->>PyPB: update_bar(bar_id, "X rows out, Y rows in")
    PyPB->>PyPB: Update description with checkmark ✓
    PyPB->>PyPB: Display elapsed time and metrics
Loading

@greptile-apps
Copy link
Contributor

greptile-apps bot commented Jan 7, 2026

Greptile's behavior is changing!

From now on, if a review finishes with no comments, we will not post an additional "statistics" comment to confirm that our review found nothing to comment on. However, you can confirm that we reviewed your changes in the status check section.

This feature can be toggled off in your Code Review Settings by deselecting "Create a status check for each PR".

@codecov
Copy link

codecov bot commented Jan 7, 2026

Codecov Report

❌ Patch coverage is 16.66667% with 80 lines in your changes missing coverage. Please review.
✅ Project coverage is 72.68%. Comparing base (e06cbec) to head (77b6fbb).
⚠️ Report is 16 commits behind head on main.

Files with missing lines Patch % Lines
daft/runners/progress_bar.py 0.00% 42 Missing ⚠️
src/daft-distributed/src/python/progress_bar.rs 0.00% 38 Missing ⚠️
Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #5966      +/-   ##
==========================================
+ Coverage   72.52%   72.68%   +0.16%     
==========================================
  Files         970      970              
  Lines      126303   126639     +336     
==========================================
+ Hits        91598    92051     +453     
+ Misses      34705    34588     -117     
Files with missing lines Coverage Δ
src/daft-local-execution/src/runtime_stats/mod.rs 92.22% <ø> (+0.27%) ⬆️
...tion/src/runtime_stats/subscribers/progress_bar.rs 55.95% <ø> (+55.95%) ⬆️
src/daft-local-execution/src/sources/in_memory.rs 88.23% <100.00%> (+1.13%) ⬆️
src/daft-local-execution/src/sources/source.rs 88.50% <100.00%> (+3.84%) ⬆️
src/daft-distributed/src/python/progress_bar.rs 0.00% <0.00%> (ø)
daft/runners/progress_bar.py 0.00% <0.00%> (ø)

... and 56 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@Jay-ju Jay-ju marked this pull request as ready for review January 8, 2026 07:16
@Jay-ju
Copy link
Contributor Author

Jay-ju commented Jan 8, 2026

@colin-ho @plotor When you have time, please take a look at the changes here and see if they are reasonable?

@kevinzwang kevinzwang requested a review from colin-ho January 8, 2026 21:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant