
Commit 71bc460

mention duplicates, not cardinality
1 parent 1fb0f8c commit 71bc460

1 file changed

Lines changed: 4 additions & 4 deletions


src/materialized/dependencies.rs

@@ -26,13 +26,13 @@ The dependency analysis in a nutshell involves analyzing the fragment of the mat
 partition columns (or row metadata columns more generally). This logical fragment is then used to generate a dependency graph between physical partitions
 of the materialized view and its source tables. This gives rise to two natural phases of the algorithm:
 1. **Inexact Projection Pushdown**: We aggressively prune the logical plan to only include partition columns (or row metadata columns more generally) of the materialized view and its sources.
-This is similar to pushing down a top-level projection on the materialized view's partition columns. However, "inexact" means that we do not preserve cardinality, order,
+This is similar to pushing down a top-level projection on the materialized view's partition columns. However, "inexact" means that we do not preserve duplicates, order,
 or even set equality of the original query.
 * Formally, let P be the (exact) projection operator. If A is the original plan and A' is the result of "inexact" projection pushdown, we have PA ⊆ A'.
 * This means that in the final output, we may have dependencies that do not exist in the original query. However, we will never miss any dependencies.
 2. **Dependency Graph Construction**: Once we have the pruned logical plan, we can construct a dependency graph between the physical partitions of the materialized view and its sources.
 After step 1, every table scan only contains row metadata columns, so we replace the table scan with an equivalent scan to a [`RowMetadataSource`](super::row_metadata::RowMetadataSource)
-This operation also is not cardinality or order preserving. Then, additional metadata is "pushed up" through the plan to the root, where it can be unnested to give a list of source files for each output row.
+This operation also is not duplicate or order preserving. Then, additional metadata is "pushed up" through the plan to the root, where it can be unnested to give a list of source files for each output row.
 The output rows are then transformed into object storage paths to generate the final graph.

 The transformation is complex, and we give a full walkthrough in the documentation for [`mv_dependencies_plan`].
@@ -325,10 +325,10 @@ fn get_table_name(args: &[Expr]) -> Result<&String> {
 /// C --> E["TableScan: daily_statistics (projection=[date])"]
 /// ```
 ///
-/// Note that the `Aggregate` node was converted into a projection. This is valid because we do not need to preserve cardinality. However, it does imply that
+/// Note that the `Aggregate` node was converted into a projection. This is valid because we do not need to preserve duplicate rows. However, it does imply that
 /// we cannot partition the materialized view on aggregate expressions.
 ///
-/// Now we substitute all scans with equivalent row metadata scans (up to cardinality), and push up the row metadata to the root of the plan,
+/// Now we substitute all scans with equivalent row metadata scans (up to addition or removal of duplicates), and push up the row metadata to the root of the plan,
 /// together with the target path constructed from the (static) partition columns. This gives us the following plan:
 ///
 /// ```mermaid
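
The wording change from "cardinality" to "duplicates" rests on the property PA ⊆ A' stated in the first hunk. The following is a minimal, self-contained sketch of that property, not the crate's implementation. The `Row` type, the sample data, the `value > 0` filter, and all function names are hypothetical. It also assumes that the pruned plan drops a filter on a non-partition column, which is one way the result can become a strict superset.

```rust
use std::collections::BTreeSet;

/// One row of a hypothetical source table (names are illustrative only).
struct Row {
    date: &'static str, // partition column, kept by the pushdown
    value: i64,         // ordinary column, pruned by the pushdown
}

/// A tiny hypothetical table spanning two partitions.
fn sample_rows() -> Vec<Row> {
    vec![
        Row { date: "2024-01-01", value: 5 },
        Row { date: "2024-01-01", value: 7 },
        Row { date: "2024-01-02", value: -1 },
    ]
}

/// PA: evaluate the original query (keep rows with value > 0, group by date),
/// then project exactly onto the partition column.
fn exact_partitions(rows: &[Row]) -> BTreeSet<&'static str> {
    rows.iter().filter(|r| r.value > 0).map(|r| r.date).collect()
}

/// A': assumed behaviour of the pruned plan. The filter on the pruned
/// column is gone and the aggregate is a plain projection, so the output
/// may contain duplicates and extra partitions.
fn inexact_partitions(rows: &[Row]) -> Vec<&'static str> {
    rows.iter().map(|r| r.date).collect()
}

/// Collapse duplicates so the two results can be compared as sets.
fn distinct(values: &[&'static str]) -> BTreeSet<&'static str> {
    values.iter().copied().collect()
}
```

With the sample data, `exact_partitions(&sample_rows())` contains only `2024-01-01`, while `inexact_partitions(&sample_rows())` lists `2024-01-01` twice and also reports `2024-01-02`. The result has a duplicate and a spurious dependency, but no dependency is missed, which is what "duplicates" describes more precisely than "cardinality".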
