Add mode() aggregation function to find most common unique value. #5864

Open
@jeremy-code

Description

The statistical mode is a common summary metric and is especially useful for nominal/categorical data. For example, suppose each row of a table is a day with a weather category from ["Sunny", "Cloudy", "Rainy"]. Knowing that most days are "Rainy" is more useful than calling unique() and learning only which categories occur.
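To make the contrast concrete, here is a minimal sketch on a plain array, independent of any table library. The sample data and the helper name `mostCommon` are made up for illustration.

```typescript
// Hypothetical sample data: one weather label per day.
const days: string[] = ["Rainy", "Sunny", "Rainy", "Cloudy", "Rainy", "Sunny"]

// What a unique()-style aggregation reports: only the distinct categories.
const uniqueValues: string[] = Array.from(new Set(days))

// What a mode aggregation would report: the most frequent category.
// `mostCommon` is an illustrative helper, not an existing library function.
function mostCommon<T>(values: T[]): T | undefined {
  const counts = new Map<T, number>()
  for (const v of values) {
    counts.set(v, (counts.get(v) ?? 0) + 1)
  }
  let best: T | undefined
  let bestCount = 0
  // Map iterates in insertion order, so on a tie the value seen first wins.
  counts.forEach((count, value) => {
    if (count > bestCount) {
      best = value
      bestCount = count
    }
  })
  return best
}

const commonest = mostCommon(days)
```

Here `uniqueValues` only lists the three categories, while `commonest` says which one dominates.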

A rough implementation is shown below. The somewhat awkward Array.from(counts).find(...) is there because, as far as I can tell, most implementations of mode return the first element with the maximum count when there is a tie. Since a Map iterates in insertion order, that is the tied value that appears first in leafRows. I am open to other implementations if this isn't a concern.

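const mode: AggregationFn<any> = (columnId, leafRows) => {
  // No rows to aggregate: there is no mode.
  if (!leafRows.length) {
    return
  }

  // Count occurrences of each value while tracking the highest count seen.
  let maxCount = 0
  const counts = leafRows.reduce((counts, row) => {
    const value = row.getValue(columnId)
    const valueCount = (counts.get(value) ?? 0) + 1
    maxCount = Math.max(maxCount, valueCount)
    return counts.set(value, valueCount)
  }, new Map<unknown, number>())

  // Map preserves insertion order, so on a tie this returns the value seen first.
  // The non-null assertion is safe: at least one entry has count === maxCount.
  return Array.from(counts).find(([, count]) => count === maxCount)![0]
}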
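The tie-breaking behavior can be checked without the table library. The sketch below assumes hypothetical stand-in types (`RowLike`, `AggregationFnLike`) that model only what the function touches, and a made-up `toRows` helper. The real AggregationFn signature may take more arguments, and the reduce is written as a loop here for brevity.

```typescript
// Hypothetical stand-ins for the library's row and aggregation types.
interface RowLike {
  getValue(columnId: string): unknown
}
type AggregationFnLike = (columnId: string, leafRows: RowLike[]) => unknown

const mode: AggregationFnLike = (columnId, leafRows) => {
  if (!leafRows.length) {
    return undefined
  }
  let maxCount = 0
  const counts = new Map<unknown, number>()
  for (const row of leafRows) {
    const value = row.getValue(columnId)
    const valueCount = (counts.get(value) ?? 0) + 1
    maxCount = Math.max(maxCount, valueCount)
    counts.set(value, valueCount)
  }
  // Insertion order is preserved, so the tied value seen first is returned.
  return Array.from(counts).find(([, count]) => count === maxCount)![0]
}

// Illustrative helper: wrap plain values as rows that return the same value for every column.
const toRows = (values: unknown[]): RowLike[] =>
  values.map((v) => ({ getValue: () => v }))

// "Cloudy" and "Rainy" both occur twice; "Cloudy" was seen first.
const tied = mode("weather", toRows(["Cloudy", "Rainy", "Rainy", "Cloudy"]))
const empty = mode("weather", toRows([]))
```

With this input `tied` is "Cloudy" and `empty` is undefined. If tie order does not matter, tracking the current best value inside the counting loop would avoid the second pass over the Map.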

And here is a StackBlitz implementation showing it passes unit tests.
