Feature scaling considerations #459

@bfhealy

During both training and inference, SCoPe normalizes all features to the range [0, 1] based on the min/max of each feature's distribution in the training set. This consistency between training and inference is very important. However, since the features in the training set do not necessarily span the same min/max as those in an arbitrary ZTF field, inference sources may have their features scaled to values outside [0, 1]. It would be ideal (but likely impractical) to compute features for all ZTF sources first and then use those global min/max values to normalize features for training/inference.
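
To make the out-of-range behavior concrete, here is a minimal sketch of min/max scaling with statistics taken from the training set only. It is not SCoPe's actual code: the function names, the feature name, and all values are hypothetical and chosen purely for illustration.

```python
def fit_min_max(train_values):
    """Return the (min, max) of one feature as seen in the training set."""
    return min(train_values), max(train_values)


def scale(values, lo, hi):
    """Min/max-scale values with fixed statistics; no clipping is applied."""
    span = hi - lo
    return [(v - lo) / span for v in values]


# Hypothetical values of a single feature (illustrative numbers only).
train_period = [0.5, 1.0, 2.0, 4.5]   # training set
field_period = [0.25, 2.5, 6.5]       # sources in some field at inference time

lo, hi = fit_min_max(train_period)    # statistics come from training only

scaled_train = scale(train_period, lo, hi)  # always within [0, 1]
scaled_field = scale(field_period, lo, hi)  # may fall below 0 or above 1

print(scaled_train)
print(scaled_field)
```

In this sketch the field contains values below the training minimum and above the training maximum, so the scaled field values land outside [0, 1] even though the mapping itself is identical to the one used in training.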

Another scaling option is to normalize a field's features based on the min/max values of that specific field's feature distributions (rather than the training set). However, while the [0, 1] range would then always be enforced, this would create an inconsistency between what a scaled feature value of 0.5 means during training vs. during inference. That inconsistency would seem to be more detrimental to classification than our current approach, which only suffers when features fall outside the range that the classifier was trained on.
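
The trade-off between the two options can be sketched as follows. Again, this is an illustration under assumed values, not SCoPe's implementation: `min_max_scale` and the numbers are hypothetical.

```python
def min_max_scale(values, lo, hi):
    """Min/max-scale values using the supplied statistics (no clipping)."""
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical raw values of one feature (illustrative numbers only).
train = [0.0, 2.0, 4.0]
field = [2.0, 4.0, 10.0]

# Current approach: reuse the training-set min/max at inference time.
train_lo, train_hi = min(train), max(train)
scaled_train = min_max_scale(train, train_lo, train_hi)
field_with_train_stats = min_max_scale(field, train_lo, train_hi)

# Alternative: scale the field with its own min/max.
field_with_field_stats = min_max_scale(field, min(field), max(field))

print(scaled_train)            # raw 2.0 maps to 0.5 during training
print(field_with_train_stats)  # raw 2.0 still maps to 0.5, but 10.0 exceeds 1
print(field_with_field_stats)  # range is [0, 1], but raw 2.0 now maps to 0.0
```

With training-set statistics, a raw value of 2.0 maps to 0.5 in both training and inference, at the cost of one value leaving [0, 1]. With per-field statistics the range is enforced, but the same raw value of 2.0 maps to 0.0, so a given scaled value no longer refers to the same raw value the classifier saw during training.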
