-
|
I want to find outliers within each group: import ibis
from ibis import _
t = ibis.examples.penguins.fetch()
is_outlier = _.body_mass_g > _.body_mass_g.quantile(0.95).over(group_by=["sex", "species"])
t[is_outlier]but this causes I can get this working in multiple steps with t2 = t.group_by(["sex", "species"]).mutate(
is_outlier=_.body_mass_g > _.body_mass_g.quantile(0.95)
)
t2[_.is_outlier]but this just feels clunky, I don't want to have to create a second relation with the temp column floating around.
|
Beta Was this translation helpful? Give feedback.
Replies: 2 comments 2 replies
-
|
FWIW, dbplyr allows window functions in library(dplyr)
con <- DBI::dbConnect(duckdb::duckdb())
t <- copy_to(con, palmerpenguins::penguins)
t |>
filter(body_mass_g > mean(body_mass_g, na.rm = TRUE), .by = c(sex, species)) |>
show_query()
## <SQL>
## SELECT
## species,
## island,
## bill_length_mm,
## bill_depth_mm,
## flipper_length_mm,
## body_mass_g,
## sex,
## "year"
## FROM (
## SELECT *, AVG(body_mass_g) OVER (PARTITION BY sex, species) AS q02
## FROM "palmerpenguins::penguins"
## ) q01
## WHERE (body_mass_g > q02)
|
Beta Was this translation helpful? Give feedback.
-
|
I think we'll have a solution for this soon™️ sqlglot supports transforming From there, sqlglot will compile either to |
Beta Was this translation helpful? Give feedback.
I think we'll have a solution for this soon™️
sqlglot supports transforming
QUALIFYintoWHERE(this is only important for some backends), but we need a way in ibis to represent a predicate that should be compiled into aQUALIFYclause. We may be able to do this with an internal-onlyops.Value, or by partitioning predicates into a newqualifyfield onops.Selection.From there, sqlglot will compile either to
QUALIFYif the backend supports it, or the equivalent projection +WHEREsimilar to what dplyr is doing.