Skip to content

Commit 44677fa

Browse files
committed
Update thresholds.qmd
1 parent dacba6b commit 44677fa

File tree

1 file changed

+32
-22
lines changed

1 file changed

+32
-22
lines changed

docs/get-started/thresholds.qmd

+32-22
Original file line numberDiff line numberDiff line change
@@ -5,34 +5,39 @@ html-table-processing: none
55
---
66

77
::::: {.callout}
8-
This is a work in progress. It's just an outline for now.
8+
This is a work in progress. And some of this article is just an outline for now.
99
:::
1010

11-
Thresholds enable you to signal failure at different severity levels. In the near future, thresholds will be able to trigger custom actions. For example, when testing a column for NULLs with `col_vals_not_null()` you might want to warn on any NULLs and stop where there are 10% NULLs in the column.
11+
Thresholds enable you to signal failure at different severity levels. In the near future, thresholds
12+
will be able to trigger custom actions. For example, when testing a column for Null/missing values
13+
with `col_vals_not_null()` you might want to warn on any missing values and stop where there are
14+
10% missing values in the column.
1215

1316
```{python}
1417
import pointblank as pb
1518
1619
validation_1 = (
1720
pb.Validate(data=pb.load_dataset(dataset="small_table"))
18-
.col_vals_not_null(columns="a", thresholds=(1, 0.1))
21+
.col_vals_not_null(columns="c", thresholds=(1, 0.1))
1922
.interrogate()
2023
)
2124
2225
validation_1
2326
```
2427

25-
The code uses `thresholds=(1, 0.1)` to set a `WARN` threshold of 1 and a `STOP` threshold of 10% failing test units. Notice these pieces in the validation table:
28+
The code uses `thresholds=(1, 0.1)` to set a `warn` threshold of `1` and a `stop` threshold of `0.1`
29+
(which is 10%) failing test units. Notice these pieces in the validation table:
2630

27-
- The `FAIL` column shows that x tests units have failed
28-
- The `W` column (short for `WARN`) shows a filled yellow circle indicating it's reached threshold
29-
- The `S` column (short for `STOP`) shows an open red circle indicating it's below threshold
31+
- The `FAIL` column shows that 2 tests units have failed
32+
- The `W` column (short for `warn`) shows a filled yellow circle indicating it's reached threshold
33+
- The `S` column (`stop`) shows an open red circle indicating it's below threshold
3034

31-
The one final threshold, `N` (`NOTIFY`), wasn't set so appears on the validation table as a dash.
35+
The one final threshold, `N` (`notify`), wasn't set so appears on the validation table as a dash.
3236

33-
## Using the `Validation(threshold=)` argument
37+
## Using the `Validation(threshold=)` Argument
3438

35-
We can also define thresholds globally. This means that every validation step will re-use the same set of threshold values.
39+
We can also define thresholds globally. This means that every validation step will re-use the same
40+
set of threshold values.
3641

3742
```python
3843
import pointblank as pb
@@ -47,16 +52,21 @@ validation_2 = (
4752
validation_2
4853
```
4954

50-
In this, both the `col_vals_not_null()` and `col_vals_gt()` steps will use the `thresholds=` value set in the `Validate()` call. Now, if you want to override these global threshold values for a given validation step, you can always use `threshold=` argument when calling a validation method.
55+
In this, both the `col_vals_not_null()` and `col_vals_gt()` steps will use the `thresholds=` value
56+
set in the `Validate()` call. Now, if you want to override these global threshold values for a given
57+
validation step, you can always use the `threshold=` argument when calling a validation method (the
58+
argument is present in every validation method).
5159

52-
## Defining Thresholds
60+
## Ways to Define Thresholds
5361

54-
### Threshold shorthands
5562

56-
The fastest way to define a threshold is to use a tuple with entries for `warn`, `stop`, and `nofify` levels.
63+
### Using a Tuple
64+
65+
The fastest way to define a threshold is to use a tuple with entries for `warn`, `stop`, and
66+
`nofify` levels.
5767

5868
```python
59-
# [WARN, STOP, NOTIFY]
69+
# (warn, stop, notify)
6070
threshold = (1, 2, 3)
6171

6272
Validate(data=..., threshold=threshold)
@@ -67,17 +77,19 @@ Note that a shorter tuple or even single values are also allowed:
6777
- `(1, 2)`: `warn` state at 1 failing test unit, `stop` state at 2 failing test units
6878
- `1` or `(1, )`: `warn` state at 1 failing test unit
6979

70-
### Threshold cutoff values
80+
### The `Threshold` Class
81+
82+
83+
## Threshold Cutoff Values
7184

7285
Threshold values can be specified in two ways:
7386

7487
- percentage: a decimal value like 0.1 to mean 10% test units failed
7588
- number: a fixed number of test units failed
7689

77-
Threshold cutoffs are inclusive so any value of failing test units greater than or equal to the cutoff will result in triggering the threshold. So if a threshold is defined with a cutoff value of `5`, then 5 failing test units will result in threshold.
78-
79-
### The `Threshold` class
80-
90+
Threshold cutoffs are inclusive so any value of failing test units greater than or equal to the
91+
cutoff will result in triggering the threshold. So if a threshold is defined with a cutoff value of
92+
`5`, then 5 failing test units will result in threshold.
8193

8294
## Triggering Actions
8395

@@ -90,5 +102,3 @@ This is not currently implemented.
90102
## Use Case: Global tolerance bands
91103

92104

93-
## Use Case: Schema Correctness After Table Joins
94-

0 commit comments

Comments
 (0)