Variable Sparsity Penalty #997

gm89uk · 2025-07-26T20:07:36Z

I'd like to propose a new penalty to encourage variable sparsity.

I have repeatedly found that searches restricted to fewer input variables reach equal or lower loss than extended searches with more variables. Ideally, when two equations of equal complexity have equal loss, the algorithm should by default prefer the one that uses fewer input variables.

I know this can be achieved through a custom loss function, but it would be nice to have it as a tunable input parameter: a penalty for each additional unique feature used in an equation. This would be similar to the existing parsimony parameter, but instead of penalising expression complexity, it would penalise the number of unique variables used.

This would act as a form of L0 regularisation on the features, helping the search find expressions that are not only simple in structure but also rely on a minimal set of inputs.
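A rough formalisation of the proposal, for reference. The symbols ($\ell$, $\lambda_v$, $V(E)$) are illustrative notation rather than names from the package, and the additive form is an assumption about how such a penalty might be applied:

```latex
\ell_{\text{sparse}}(E) = \ell(E) + \lambda_v \,\lvert V(E) \rvert,
\qquad
V(E) = \{\, j : x_j \text{ appears in } E \,\}
```

Here $\lvert V(E) \rvert$ is the number of distinct input features in expression $E$, i.e. the L0 norm of the feature-usage indicator vector. In this form $\lambda_v$ would have to be chosen relative to the scale of the loss.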

From what I understand, this is different from complexity_of_variables, which does not account for the number of unique variables. Maybe this would also help SR slightly in higher-dimensional problems?

MilesCranmer (Maintainer) · 2025-07-26T23:05:36Z

I think this might be a good use case for complexity_mapping? What do you think:

function variable_sparsity_complexity(expression)
    tree = get_tree(expression)  # (for template expressions, would need to do something more complex)
    num_nodes = Ref(0)
    unique_features = Set{Int}()
    foreach(tree) do node  # (equivalent to `for node in tree`, but slightly faster)
        num_nodes[] += 1
        if node.degree == 0 && !node.constant
            push!(unique_features, node.feature)
        end
    end
    # complexity is normal complexity + number of unique features used
    return num_nodes[] + length(unique_features)
end

options_sparse = Options(;  # or SRRegressor
    binary_operators=[+, -, *],
    complexity_mapping=variable_sparsity_complexity,
)
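As a worked illustration of what this mapping returns, take two hypothetical expressions with the same tree size but different variable usage. The notation $C'$, $N$, and $V$ is introduced here only for the example: $N(E)$ is the node count and $V(E)$ the set of distinct features, matching `num_nodes` and `unique_features` in the function above.

```latex
C'(E) = N(E) + \lvert V(E) \rvert

E_1 = x_1 x_2 + x_1 : \quad N = 5,\; \lvert V \rvert = 2 \;\Rightarrow\; C'(E_1) = 7

E_2 = x_1 x_1 + x_1 : \quad N = 5,\; \lvert V \rvert = 1 \;\Rightarrow\; C'(E_2) = 6
```

Both trees have five nodes (`+`, `*`, and three leaves), so they tie on node count alone, while the extra term ranks the single-variable expression as simpler.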

1 reply

gm89uk Jul 26, 2025
Author

Thanks Miles, I think that's a much more elegant solution. Increasing complexity rather than penalising the loss is more intuitive and avoids the scaling issues that parsimony has.
