Skip to content

Conversation

@Jay-ju
Copy link
Contributor

@Jay-ju Jay-ju commented Jan 11, 2026

  1. Frontend Enhancements:

    • All Queries Page: Updated table header to use white background (bg-white) with black text and grey separators, improving readability.
    • Query Detail Page:
      • Added Entrypoint (command line) and Engine (Swordfish/Flotilla) fields to the metadata section.
      • Added a direct link to the Ray Dashboard for Ray-based queries.
      • Improved metadata visibility by using high-contrast text (text-zinc-100).
      • Progress Table: Refined table headers with dark theme (bg-zinc-800), white text, and clear column separators. Added hover effects for better interactivity.
    • Engine Naming: Standardized engine display names (Native -> Swordfish, Ray -> Flotilla).
  2. Backend Fixes & Improvements:

    • State Management: Fixed an issue where failed Ray queries were not correctly reporting their terminal state to the dashboard (causing 400 errors). Now allows transitions to Failed state from active states.
    • Metadata Propagation: Updated RayRunner to capture and transmit entrypoint and ray_dashboard_url to the dashboard backend.
    • Python API: Exposed repr_json on DistributedPhysicalPlan in init.pyi to fix mypy errors and support plan visualization.
  3. Code Cleanup:

    • Removed unused imports and debug logging.
    • Standardized sys and os imports in ray_runner.py.
    • Fixed mypy type definition errors in daft/init.pyi related to context notification methods.

Changes Made

image image

Related Issues

1.  **Frontend Enhancements**:
    *   **All Queries Page**: Updated table header to use white background (bg-white) with black text and grey separators, improving readability.
    *   **Query Detail Page**:
        *   Added Entrypoint (command line) and Engine (Swordfish/Flotilla) fields to the metadata section.
        *   Added a direct link to the **Ray Dashboard** for Ray-based queries.
        *   Improved metadata visibility by using high-contrast text (text-zinc-100).
        *   **Progress Table**: Refined table headers with dark theme (bg-zinc-800), white text, and clear column separators. Added hover effects for better interactivity.
    *   **Engine Naming**: Standardized engine display names (Native -> Swordfish, Ray -> Flotilla).

2.  **Backend Fixes & Improvements**:
    *   **State Management**: Fixed an issue where failed Ray queries were not correctly reporting their terminal state to the dashboard (causing 400 errors). Now allows transitions to Failed state from active states.
    *   **Metadata Propagation**: Updated RayRunner to capture and transmit entrypoint and ray_dashboard_url to the dashboard backend.
    *   **Python API**: Exposed repr_json on DistributedPhysicalPlan in __init__.pyi to fix mypy errors and support plan visualization.

3.  **Code Cleanup**:
    *   Removed unused imports and debug logging.
    *   Standardized sys and os imports in ray_runner.py.
    *   Fixed mypy type definition errors in daft/__init__.pyi related to context notification methods.
@github-actions github-actions bot added the feat label Jan 11, 2026
@greptile-apps
Copy link
Contributor

greptile-apps bot commented Jan 11, 2026

Greptile Overview

Greptile Summary

This PR enhances the Daft dashboard with improved UI and fixes critical state reporting issues for Ray queries.

Key Changes

Backend Improvements:

  • Fixes Ray runner state management by allowing terminal state (Failed/Canceled) transitions from active states (Executing, Setup, Optimizing), resolving 400 errors when queries fail
  • Changes timestamp precision from u64 to f64 throughout the stack for millisecond-level accuracy
  • Adds runner, ray_dashboard_url, and entrypoint fields to query metadata for better tracking
  • Removes the Ray runner restriction from dashboard subscriber
  • Adds comprehensive query lifecycle notifications (notify_exec_start, notify_exec_end, notify_exec_operator_start, etc.)

Frontend Enhancements:

  • Adds Duration, Entrypoint, Engine, and Ray UI columns to the queries table
  • Implements direct Ray Dashboard links for Ray-based queries with job ID appending
  • Improves table styling with white headers, better borders, and hover effects
  • Standardizes engine naming (Native → Swordfish, Ray → Flotilla)
  • Enhances timestamp formatting to show milliseconds

Code Quality:

  • Exposes repr_json() on DistributedPhysicalPlan (currently returns dummy JSON)
  • Updates Python type stubs to match new API

Implementation Notes

The core fix addresses a state machine issue where Ray queries that failed couldn't transition to the Failed state, causing backend 400 errors. The solution makes plan_info and exec_info optional in Failed/Canceled states and allows transitions from any active state (lines 330-346 in engine.rs).

The Ray dashboard URL extraction uses ray.worker.get_dashboard_url() and attempts to append the job ID when available, falling back gracefully on errors.

Minor Issues

All findings are non-blocking style/documentation issues (see inline comments for details).

Confidence Score: 4/5

  • Safe to merge with minor style improvements recommended
  • The core functionality changes are sound: the state transition fix properly addresses the Ray query failure reporting issue, metadata propagation is implemented consistently across the stack, and frontend changes are purely additive UI enhancements. The timestamp precision change from u64 to f64 is handled correctly throughout. However, there are several minor style issues: inline import in native_runner.py violates project guidelines, misleading comment about commented-out code that actually executes, debug logging left in production code, @ts-ignore suppressing type errors, and undocumented gravitino import removal. These are all non-blocking style/cleanup issues that don't affect correctness.
  • daft/runners/native_runner.py (inline import and misleading comment), daft/init.py (undocumented gravitino change), src/daft-dashboard/frontend/src/app/queries/page.tsx (@ts-ignore)

Important Files Changed

File Analysis

Filename Score Overview
daft/runners/native_runner.py 3/5 Adds entrypoint tracking and query lifecycle notifications; contains inline import violation and misleading comment about code that is actually executing
daft/runners/ray_runner.py 4/5 Adds comprehensive query lifecycle tracking with Ray dashboard URL extraction and proper error handling
daft/init.py 4/5 Comments out gravitino imports (unrelated change not mentioned in PR description)
src/daft-dashboard/src/engine.rs 4/5 Changes timestamps to f64, adds new metadata fields, relaxes state transition requirements for terminal states, includes debug logging
src/daft-dashboard/src/state.rs 5/5 Updates state structs to use f64 timestamps and makes plan_info/exec_info optional for Failed/Canceled states
src/daft-dashboard/frontend/src/app/queries/page.tsx 3/5 Adds new columns for duration, entrypoint, engine, and Ray UI link; includes @ts-ignore for type error

Sequence Diagram

sequenceDiagram
    participant User
    participant Runner as Runner (Native/Ray)
    participant Context as DaftContext
    participant Subscriber as DashboardSubscriber
    participant Backend as Dashboard Backend
    participant Frontend as Dashboard Frontend
    
    User->>Runner: Execute query
    Runner->>Context: _notify_query_start(query_id, metadata)
    Note over Runner: metadata includes runner, entrypoint, ray_dashboard_url
    Context->>Subscriber: on_query_start(query_id, metadata)
    Subscriber->>Backend: POST /query/{id}/start
    Backend->>Frontend: WebSocket update
    
    Runner->>Context: _notify_optimization_start(query_id)
    Context->>Subscriber: on_optimization_start(query_id)
    Subscriber->>Backend: POST /query/{id}/plan/start
    Backend->>Frontend: WebSocket update (status: Optimizing)
    
    Runner->>Runner: Optimize plan
    Runner->>Context: _notify_optimization_end(query_id, optimized_plan)
    Context->>Subscriber: on_optimization_end(query_id, plan)
    Subscriber->>Backend: POST /query/{id}/plan/end
    Backend->>Frontend: WebSocket update (status: Setup)
    
    Runner->>Context: _notify_exec_start(query_id, physical_plan)
    Context->>Subscriber: on_exec_start(query_id, physical_plan)
    Subscriber->>Backend: POST /query/{id}/exec/start
    Backend->>Frontend: WebSocket update (status: Executing)
    
    loop For each result
        Runner->>Context: _notify_exec_emit_stats(query_id, node_id, stats)
        Context->>Subscriber: on_exec_emit_stats(query_id, stats)
        Subscriber->>Backend: POST /query/{id}/exec/op/{op_id}/emit_stats
        Backend->>Frontend: WebSocket update (progress data)
    end
    
    alt Success
        Runner->>Context: _notify_query_end(query_id, Finished)
        Context->>Subscriber: on_query_end(query_id, result)
        Subscriber->>Backend: POST /query/{id}/end (Finished)
        Backend->>Frontend: WebSocket update (status: Finished)
    else Failure
        Runner->>Context: _notify_query_end(query_id, Failed)
        Context->>Subscriber: on_query_end(query_id, result)
        Subscriber->>Backend: POST /query/{id}/end (Failed)
        Note over Backend: Accepts Failed from Executing state
        Backend->>Frontend: WebSocket update (status: Failed)
    end
    
    Frontend->>User: Display query status and Ray dashboard link
Loading

Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

5 files reviewed, 5 comments

Edit Code Review Agent Settings | Greptile


# Optimize the logical plan.
ctx._notify_query_start(query_id, PyQueryMetadata(output_schema._schema, builder.repr_json()))
import sys
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[P2] Import statements should be at the top of the file. Move import sys to the imports section at the beginning of the file (around line 3-4) per the project's import style guidelines.

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Prompt To Fix With AI
This is a comment left during a code review.
Path: daft/runners/native_runner.py
Line: 99:99

Comment:
[P2] Import statements should be at the top of the file. Move `import sys` to the imports section at the beginning of the file (around line 3-4) per the project's import style guidelines.

<sub>Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!</sub>

How can I resolve this? If you propose a fix, please make it concise.

Comment on lines +126 to +129
# Try to emit stats for operator 0?
# If NativeExecutor doesn't start operator 0, this might fail or be ignored.
# Let's comment it out for now to see baseline behavior.
ctx._notify_exec_emit_stats(query_id, 0, {"rows in": total_rows, "rows out": total_rows})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[P3] The comment says "Let's comment it out for now to see baseline behavior" but the code on line 129 is actually executing. Either remove this misleading comment or comment out the code if it was intended to be disabled.

Prompt To Fix With AI
This is a comment left during a code review.
Path: daft/runners/native_runner.py
Line: 126:129

Comment:
[P3] The comment says "Let's comment it out for now to see baseline behavior" but the code on line 129 is actually executing. Either remove this misleading comment or comment out the code if it was intended to be disabled.

How can I resolve this? If you propose a fix, please make it concise.

Comment on lines +174 to +182
tracing::info!("Received exec_start for query {}", query_id);
let query_info = state.queries.get_mut(&query_id);
let Some(mut query_info) = query_info else {
tracing::error!("Query {} not found in exec_start", query_id);
return StatusCode::BAD_REQUEST;
};

// Debug state
tracing::info!("Query {} state: {:?}", query_id, query_info.state);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[P3] Debug logging statements appear to have been left in from development. The comment "Debug state" at line 181 and tracing::info! calls at lines 174 and 182 look like temporary debugging code. Consider removing or converting to tracing::debug! if this level of verbosity isn't needed in production.

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Prompt To Fix With AI
This is a comment left during a code review.
Path: src/daft-dashboard/src/engine.rs
Line: 174:182

Comment:
[P3] Debug logging statements appear to have been left in from development. The comment "Debug state" at line 181 and `tracing::info!` calls at lines 174 and 182 look like temporary debugging code. Consider removing or converting to `tracing::debug!` if this level of verbosity isn't needed in production.

<sub>Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!</sub>

How can I resolve this? If you propose a fix, please make it concise.

Comment on lines +123 to +124
// @ts-ignore
columnHelper.accessor("ray_dashboard_url", {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[P2] Using @ts-ignore suppresses TypeScript type checking instead of fixing the underlying type issue. The ray_dashboard_url property is properly defined as optional in the QuerySummary type. Consider properly typing the accessor or using @ts-expect-error with a specific explanation if this is a known limitation of the column helper library.

Prompt To Fix With AI
This is a comment left during a code review.
Path: src/daft-dashboard/frontend/src/app/queries/page.tsx
Line: 123:124

Comment:
[P2] Using `@ts-ignore` suppresses TypeScript type checking instead of fixing the underlying type issue. The `ray_dashboard_url` property is properly defined as optional in the `QuerySummary` type. Consider properly typing the accessor or using `@ts-expect-error` with a specific explanation if this is a known limitation of the column helper library.

How can I resolve this? If you propose a fix, please make it concise.

import daft.datasets as datasets
import daft.functions as functions
import daft.gravitino as gravitino
# import daft.gravitino as gravitino
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[P2] The gravitino import is commented out but this change isn't mentioned in the PR description. This appears to be an unrelated change. If gravitino support is being removed or temporarily disabled, it should be documented in the PR description or done in a separate commit.

Prompt To Fix With AI
This is a comment left during a code review.
Path: daft/__init__.py
Line: 155:155

Comment:
[P2] The gravitino import is commented out but this change isn't mentioned in the PR description. This appears to be an unrelated change. If gravitino support is being removed or temporarily disabled, it should be documented in the PR description or done in a separate commit.

How can I resolve this? If you propose a fix, please make it concise.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant