128 changes: 126 additions & 2 deletions docs/source/polars-cloud/run/query-profile.md
@@ -56,9 +56,98 @@ orders = pl.scan_parquet(
)
```

{{code_block('polars-cloud/query-profile','execute',[])}}

</details>

{{code_block('polars-cloud/query-profile','execute',[])}}
<!-- Execute query -->

## Polars Cloud Query Profiler

Polars Cloud has a built-in query profiler. It shows the real-time status of a query during and
after execution and provides detailed metrics down to the node level, which helps you find and
analyze bottlenecks and tune your queries.

It can be accessed from the Cluster Dashboard.

### Cluster Dashboard

The cluster dashboard gives you insight into:

- system metrics (CPU, memory, and network) of all nodes in your cluster.
- an overview of the queries related to this cluster: scheduled, running, and finished.

![Cluster Dashboard](https://raw.githubusercontent.com/pola-rs/polars-static/refs/heads/master/docs/query-profiler/cluster_dashboard.png)

You can open the cluster dashboard from the pop-ups on the Polars Cloud dashboard after starting a
compute cluster, or from the details page of your compute.

![Compute Details page](https://raw.githubusercontent.com/pola-rs/polars-static/refs/heads/master/docs/query-profiler/compute_dashboard.png)

This dashboard is served from the compute that runs your queries. It becomes available the moment
your compute has started and is no longer available once your cluster shuts down.

The system metrics let you find bottlenecks and adjust your cluster configuration accordingly:

- If CPU usage maxes out, you can add CPUs.
- If memory maxes out, you can add memory.
- If network bandwidth maxes out, you can add more nodes.
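
The rules of thumb above can be written down as a small lookup. The following is a minimal sketch and not part of Polars Cloud: the function name `suggest_scaling`, the 0.9 threshold, and the utilization inputs (fractions between 0 and 1 that you would read off the dashboard yourself) are all assumptions made for illustration.

```python
def suggest_scaling(cpu: float, memory: float, network: float, threshold: float = 0.9) -> list[str]:
    """Map peak utilization fractions (0.0-1.0) to the remedies listed above.

    Hypothetical helper: the inputs are values you read from the dashboard,
    and nothing here calls Polars Cloud.
    """
    suggestions = []
    if cpu >= threshold:
        suggestions.append("add CPUs")
    if memory >= threshold:
        suggestions.append("add memory")
    if network >= threshold:
        suggestions.append("add more nodes")
    return suggestions


# Example: CPU and network are saturated, memory is not.
print(suggest_scaling(cpu=0.97, memory=0.55, network=0.93))
```

An empty list means none of the three metrics reached the chosen threshold.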

### Query Details

Selecting a query in the cluster dashboard opens its details, starting with an overview of the
general metrics of that query.

![Query Details](https://raw.githubusercontent.com/pola-rs/polars-static/refs/heads/master/docs/query-profiler/query_details.png)

From here you can dive deeper into different aspects of the query. The first one we'll explore is
the logical plan.

### Logical Plan

In Polars, a logical plan is the intermediate representation (IR) of a query: it describes which
operations to perform, before physical execution details are decided. The logical plan view shows
this representation of the query you sent to Polars Cloud as a graph.

![Logical Plan](https://raw.githubusercontent.com/pola-rs/polars-static/refs/heads/master/docs/query-profiler/logical_plan.png)

<!--What can you do with this?-->
Review comment (Member): You can think of the logical plan as the EXPLAIN in SQL. It describes what steps are performed and in which order, independent of the physical engine.

Review comment (Collaborator Author): Is it directly a graph representation of the Polars DSL, or is there already some processing done on it?


### Stage Graph

The stage graph represents the different phases in which the plan is executed on the distributed
cluster.

In the stage graph overview, you can click a stage to open its details.

![Stage graph details](https://raw.githubusercontent.com/pola-rs/polars-static/refs/heads/master/docs/query-profiler/stage_graph_stage_details.png)

Alternatively, you can click one of the nodes in any stage to open up its details.

![Stage graph node details](https://raw.githubusercontent.com/pola-rs/polars-static/refs/heads/master/docs/query-profiler/stage_graph_node_details.png)

<!-- what is the exact definition of a stage?-->

The stage graph is not available when a query is executed on a single node.

### Physical Plan

The physical plan shows the strategy that was used to execute the query.

![Physical plan](https://raw.githubusercontent.com/pola-rs/polars-static/refs/heads/master/docs/query-profiler/physical_plan.png)

The physical plan shows the time spent per node, which helps you identify choke points. Some nodes
are also marked with a warning that they are memory intensive. The details pane lists specific
metrics, such as how many rows went in and out, what the morsel sizes were, and how many morsels
went through.
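
To make "time spent per node" concrete, the sketch below picks the slowest node out of a list of per-node timings and reports its share of the total. Everything in it is hypothetical: the `NodeMetrics` record, its field names, and the numbers are invented for illustration and are not an export format of the profiler.

```python
from dataclasses import dataclass


@dataclass
class NodeMetrics:
    """Hypothetical per-node record; field names are illustrative only."""

    name: str
    seconds: float
    rows_in: int
    rows_out: int


def slowest_node(nodes):
    """Return the node with the largest time spent and its share of the total time."""
    total = sum(n.seconds for n in nodes)
    worst = max(nodes, key=lambda n: n.seconds)
    return worst, worst.seconds / total


# Made-up numbers, as you might note them down from the details pane.
nodes = [
    NodeMetrics("scan", 1.0, 0, 1_000_000),
    NodeMetrics("filter", 0.5, 1_000_000, 200_000),
    NodeMetrics("group_by", 6.5, 200_000, 50),
]
worst, share = slowest_node(nodes)
print(f"{worst.name}: {share:.0%} of total node time")
```

A node that dominates the total is the first place to look when tuning the query.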

## Profile with the Polars Cloud SDK

Besides the query profiler in the cluster dashboard, you can also get diagnostic information through
the Polars Cloud SDK.

### `QueryProfile` and `QueryResult`

The `await_profile` method can be used to monitor an in-progress query. It returns a `QueryProfile`
object containing a DataFrame with information about which stages are being processed across
@@ -105,7 +194,7 @@ As each worker starts and completes each stage of the query, it notifies the lea
`await_profile` method will poll the lead worker until there is an update from any worker, and then
return the full profile data of the query.
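
The polling behavior described above follows a common pattern: ask repeatedly until the answer changes, then return it. The sketch below shows that pattern in plain Python. It is not how `await_profile` is implemented; `poll_until_changed`, its parameters, and the fake lead worker are assumptions made purely for illustration.

```python
import time


def poll_until_changed(fetch, last_seen, interval=0.01, timeout=1.0):
    """Call `fetch()` repeatedly until it returns something other than `last_seen`.

    Generic polling sketch: `fetch` stands in for "ask the lead worker for the
    current profile". Raises TimeoutError if nothing changes in time.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        current = fetch()
        if current != last_seen:
            return current
        time.sleep(interval)
    raise TimeoutError("no profile update before the timeout")


# Fake lead worker: reports a newly completed stage on the third poll.
snapshots = iter([{"stages_done": 1}, {"stages_done": 1}, {"stages_done": 2}])
update = poll_until_changed(lambda: next(snapshots), last_seen={"stages_done": 1})
print(update)
```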

The QueryProfile object also has a summary property to return an aggregated view of each stage.
The `QueryProfile` object also has a `summary` property to return an aggregated view of each stage.

{{code_block('polars-cloud/query-profile','await_summary',[])}}

@@ -129,3 +218,38 @@ shape: (13, 6)
│ 7 ┆ Execute IR ┆ true ┆ i-xxx ┆ 356662µs ┆ 1131041 ┆ 289546496 ┆ 0 │
└──────────────┴──────────────┴───────────┴────────────┴──────────────┴─────────────┴───────────────────────┴────────────────────┘
```

### Plan
Review comment (Member): This is really subpar compared to the dashboard. If we add this to the user guide, I think it should be somewhere else.

Review comment (Collaborator Author): Yeah, I agree. The interface is far more detailed and practical to explore. I already put this in its own section to separate it. But since it is Polars Cloud SDK functionality for query profiling, I wouldn't know where else to put it.
If you want to leave it out completely for now, let me know.


`QueryProfile` also exposes `.plan()` to retrieve the physical plan as a string, and `.graph()` to
render it as a visual diagram. Both are described below.

Use `.plan()` to retrieve the executed query plan as a string. This shows how Polars executed your
query, including the physical stages and the operations performed across the cluster.

{{code_block('polars-cloud/query-profile','explain',['QueryResult'])}}

```text
# TODO: add example output
```

You can also retrieve the optimized intermediate representation (IR) of the query before execution
by passing `"ir"` as the plan type.

{{code_block('polars-cloud/query-profile','explain_ir',['QueryResult'])}}

```text
# TODO: add example output
```

### Graph

Both `plan()` and `graph()` are available on `QueryResult` (with `plan_type` set to `"physical"` or
`"ir"`) and on `QueryProfile` (physical plan only). These methods are only available in direct mode.

Use `.graph()` to render the plan as a visual dot diagram using matplotlib.

{{code_block('polars-cloud/query-profile','graph',['QueryResult'])}}

<!-- TODO: Image of graph output -->
12 changes: 12 additions & 0 deletions docs/source/src/python/polars-cloud/query-profile.py
@@ -10,4 +10,16 @@
# --8<-- [start:await_summary]
result.await_profile().summary
# --8<-- [end:await_summary]

# --8<-- [start:explain]
result.await_result().plan()
# --8<-- [end:explain]

# --8<-- [start:explain_ir]
result.await_result().plan("ir")
# --8<-- [end:explain_ir]

# --8<-- [start:graph]
result.await_result().graph()
# --8<-- [end:graph]
"""