
test(benchmarks): add basic, local benchmark suite #173

Open
@deepyaman

Description


Is local ML preprocessing with DuckDB faster than ML preprocessing with scikit-learn? How does one-hot encoding with IbisML on Snowflake compare to snowflake.ml.modeling.preprocessing.OneHotEncoder? Can ML preprocessing on the database outperform ML preprocessing in Ray Data? Should we have been training our deep learning models in T-SQL all along?

Let's start by looking at benchmarks run locally, across various data volumes and numbers of preprocessing steps. The purpose is mostly to understand the workflows in which IbisML can provide value; it is not, for instance, to say that people shouldn't use scikit-learn for many local ML pipelines.
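To make the comparison concrete, here's a rough sketch of what one such local micro-benchmark might look like. It uses plain Ibis on DuckDB as a stand-in for an IbisML one-hot-encoding step and times it against scikit-learn's `OneHotEncoder` at a few row counts; the `color` column, the category values, and the row counts are all illustrative and not part of any existing suite.

```python
import time

import ibis
import numpy as np
import pandas as pd
from sklearn.preprocessing import OneHotEncoder

categories = ["a", "b", "c", "d"]

for n_rows in (10_000, 100_000, 1_000_000):
    # Synthetic input: a single categorical column.
    df = pd.DataFrame({"color": np.random.choice(categories, size=n_rows)})

    # scikit-learn: fit and transform in memory (dense output for comparability).
    start = time.perf_counter()
    OneHotEncoder(sparse_output=False).fit_transform(df[["color"]])
    sklearn_s = time.perf_counter() - start

    # Ibis on DuckDB: load the data first, then time only the one-hot projection.
    con = ibis.duckdb.connect()
    t = con.create_table("bench", df)
    start = time.perf_counter()
    t.select(
        *((t.color == cat).cast("int8").name(f"color_{cat}") for cat in categories)
    ).execute()
    duckdb_s = time.perf_counter() - start

    print(f"{n_rows:>9,} rows: sklearn {sklearn_s:.3f}s, duckdb {duckdb_s:.3f}s")
```

Loading the data into DuckDB via `create_table` before starting the timer keeps the measurement to the transform itself; the actual suite would presumably swap in real IbisML steps and vary the number of preprocessing steps as well as the data volume.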

Metadata

Assignees: No one assigned
Labels: No labels
Type: No type
Projects: Status: backlog
Milestone: No milestone
Relationships: None yet
Development: No branches or pull requests
