Add mode() aggregation function to find most common unique value. #5864
Open
Description
The statistical mode is a very common metric and is especially useful with nominal/categorical data. For example, say a table has one row per day and a weather column with the categories ["Sunny", "Cloudy", "Rainy"]. Knowing that most days are "Rainy" is more useful than calling unique() and knowing only that the weather can be any of those categories.
A rough implementation can be seen below. The reason for the odd-looking Array.from(counts).find(...) is that most implementations of mode, from what I can tell, prefer returning the first element with the maximum count when there is a tie. I am open to other implementations if this isn't a concern.
const mode: AggregationFn<any> = (columnId, leafRows) => {
  // No rows means there is no mode to report.
  if (!leafRows.length) {
    return
  }
  let maxCount = 0
  // Count occurrences per value; a Map keeps keys in first-seen order.
  const counts = leafRows.reduce((acc, row) => {
    const value = row.getValue(columnId)
    const valueCount = (acc.get(value) ?? 0) + 1
    maxCount = Math.max(maxCount, valueCount)
    return acc.set(value, valueCount)
  }, new Map<unknown, number>())
  // On a tie, return the first-seen value among those with the maximum count.
  return Array.from(counts).find(([, count]) => count === maxCount)![0]
}
And here is a StackBlitz implementation showing it passes unit tests.
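To make the tie-breaking behavior easier to see without the table API, the same counting logic can be sketched as a standalone helper over a plain array. This is only an illustration under assumptions: modeOf and the sample data are hypothetical names and are not part of the library or of the proposed change.

```typescript
// Hypothetical standalone helper (an assumption for illustration, not a
// library API): returns the most frequent value in `values`, or undefined
// for an empty array.
function modeOf<T>(values: readonly T[]): T | undefined {
  let maxCount = 0
  const counts = new Map<T, number>()
  for (const value of values) {
    const count = (counts.get(value) ?? 0) + 1
    counts.set(value, count)
    maxCount = Math.max(maxCount, count)
  }
  // A Map keeps its keys in insertion order, so the first entry whose count
  // equals maxCount is the tied value that appeared earliest in the input.
  const entry = Array.from(counts).find(([, count]) => count === maxCount)
  return entry?.[0]
}

// Hypothetical sample data matching the weather example.
const weather = ["Rainy", "Sunny", "Rainy", "Cloudy", "Rainy"]
const uniqueWeather = Array.from(new Set(weather)) // ["Rainy", "Sunny", "Cloudy"]
const commonWeather = modeOf(weather) // "Rainy"

// Tie: both values occur twice; "Sunny" was seen first, so it is returned.
const tie = modeOf(["Sunny", "Cloudy", "Cloudy", "Sunny"]) // "Sunny"
```

Note that in the tie case "Cloudy" reaches a count of two before "Sunny" does, yet "Sunny" is still returned, because the lookup follows first-seen order rather than the order in which values reach the maximum count.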