Commit efeaa22

Update tutorials to use expression syntax.
Missed some parts.
1 parent c8c893b commit efeaa22

2 files changed, 4 additions and 4 deletions

docs/coming_from_polars.md

Lines changed: 1 addition & 1 deletion
@@ -290,7 +290,7 @@ We implicitly create a `Count` variable as the result of grouping by an aggregat
 ```haskell
 let decade = (*10) . flip div 10 . year
 df_csv
-    |> D.derive "decade" decade "birthdate"
+    |> D.derive "decade" (lift decade (col @date "birthdate"))
     |> D.select ["decade"]
     |> D.groupByAgg D.Count ["decade"]
 ```
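For readers checking what the changed pipeline is meant to compute, here is a minimal, library-free sketch of the same decade bucketing and count. It assumes plain `Integer` years instead of dates; `decadeOf` and `countByDecade` are hypothetical names for illustration and are not part of the dataframe library.

```haskell
import Data.List (group, sort)

-- Same bucketing as the tutorial's `decade`, minus the `year` accessor:
-- integer-divide by 10, then multiply back.
decadeOf :: Integer -> Integer
decadeOf = (* 10) . flip div 10

-- Count how many values fall in each decade
-- (a plain-list stand-in for grouping by "decade" and counting).
countByDecade :: [Integer] -> [(Integer, Int)]
countByDecade years =
  [(d, length g) | g@(d : _) <- group (sort (map decadeOf years))]

main :: IO ()
main = print (countByDecade [1987, 1990, 1994, 2001])
-- prints [(1980,1),(1990,2),(2000,1)]
```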

docs/exploratory_data_analysis_primer.md

Lines changed: 3 additions & 3 deletions
@@ -101,7 +101,7 @@ In the housing dataset it'll tell how "typical" our typical home price is.
 ghci> import Data.Maybe
 ghci> m = fromMaybe 0 $ D.mean "median_house_value" df
 206855.81690891474
-ghci> df |> D.derive "deviation" (\v -> abs (v - m)) "median_house_value" |> D.select ["median_house_value", "deviation"] |> D.take 10
+ghci> df |> D.derive "deviation" (D.col "median_house_value" - D.lit m) |> D.select ["median_house_value", "deviation"] |> D.take 10
 -----------------------------------------------
 index | median_house_value | deviation
 ------|--------------------|-------------------
@@ -119,12 +119,12 @@ index | median_house_value | deviation
 9 | 261100.0 | 54244.18309108526
 ```

-Read left to right, we begin by calling `derive` which applies a function to a given column and stores the result in a target column. The order of arguments is `derive <target column> <function> <deriving column> <dataframe>`. We then select only the two columns we want and take the first 10 rows.
+Read left to right, we begin by calling `derive` which creates a new column computed from a given expression. The order of arguments is `derive <target column> <expression> <dataframe>`. We then select only the two columns we want and take the first 10 rows.

 This gives us a list of the deviations. From the small sample it does seem like there are some wild deviations. The first one is greater than the mean! How typical is this? Well to answer that we take the average of all these values.

 ```haskell
-ghci> withDeviation = df |> D.derive "deviation" (\v -> abs (v - m)) "median_house_value" |> D.select ["median_house_value", "deviation"]
+ghci> withDeviation = df |> D.derive "deviation" (D.col "median_house_value" - D.lit m) |> D.select ["median_house_value", "deviation"]
 ghci> D.mean "deviation" withDeviation
 Just 91170.43994367732
 ```
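To make the new argument order `derive <target column> <expression> <dataframe>` concrete, here is a toy model of an expression-based `derive`. It is a sketch under stated assumptions: the `Expr` type, `eval`, and this `derive` are hypothetical stand-ins written for illustration, rows are simple association lists of `Double` values, and none of it reflects how the library is actually implemented.

```haskell
-- A toy expression language over Double columns (hypothetical, not the library's types).
data Expr
  = Col String      -- read a named column
  | Lit Double      -- a constant
  | Sub Expr Expr   -- subtraction
  | Abs Expr        -- absolute value

type Row = [(String, Double)]

-- Evaluate an expression against one row; Nothing if a column is missing.
eval :: Row -> Expr -> Maybe Double
eval row (Col name) = lookup name row
eval _ (Lit x) = Just x
eval row (Sub a b) = (-) <$> eval row a <*> eval row b
eval row (Abs e) = abs <$> eval row e

-- derive <target column> <expression> <rows>
-- Rows where the expression cannot be evaluated are left unchanged.
derive :: String -> Expr -> [Row] -> [Row]
derive target expr = map addColumn
  where
    addColumn row = case eval row expr of
      Just v -> row ++ [(target, v)]
      Nothing -> row

main :: IO ()
main = do
  let rows = [[("price", 10)], [("price", 30)]]
      m = 20
  mapM_ print (derive "deviation" (Abs (Sub (Col "price") (Lit m))) rows)
  -- prints [("price",10.0),("deviation",10.0)]
  --        [("price",30.0),("deviation",10.0)]
```

The `Abs` node mirrors the `abs` in the removed lambda `\v -> abs (v - m)`; the expressions on the added lines subtract without it, so values below the mean would come out negative there.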