PSRN Integration Slowing Down Population Evolution #23

@x66ccff

Hi @MilesCranmer,

I've switched back to the Reactant branch because the PythonCall branch doesn't coordinate well with Julia's multithreading: it freezes when launched with multiple threads but works fine with a single thread. This happens even after applying the changes recommended in their documentation at https://juliapy.github.io/PythonCall.jl/stable/pythoncall/#jl-multi-threading.

I believe we should continue with Reactant.jl as it seems easier for users to install. Both PythonCall and Tharray.jl introduce complications for users.

Current Implementation

Repository: https://github.com/x66ccff/SymbolicRegressionGPU.jl/tree/reactant.jl-no-stuck-cpu

Performance Bottleneck

I've been analyzing the performance bottlenecks in the Reactant.jl branch and have a question about one issue:

Currently, one population in the genetic algorithm calls PSRN:

if options.populations > 3  # TODO: I don't know how to add an option that controls whether PSRN is used; Options is too complex for me.
    start_psrn_task(
        psrn_manager, dominating_trees, dataset, options, N_PSRN_INPUT, n_variables
    )
    process_psrn_results!(
        psrn_manager, state.halls_of_fame[j], dataset, options
    )
end

With this approach, the other populations seem to wait for the PSRN-calling population to finish before continuing their own iterations, which significantly slows overall performance. I've been trying to change this, but I'm not very familiar with Julia and don't know how to fix it properly.

Maybe we need a dedicated thread for PSRN instead of using one of the population threads? Do you know how to make that change? I'd appreciate specific code-level help.
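To make the question concrete, here is a minimal sketch of the pattern I have in mind: start the slow call with `Threads.@spawn`, keep the task handle, and poll it with `istaskdone` so the loop never blocks on it. `fake_psrn` and `evolve_with_background_psrn` are hypothetical stand-ins, not functions from SymbolicRegression.jl or my branch, and I'm assuming the real PSRN call is safe to run from a separate task, which I haven't verified with Reactant.

```julia
# Hypothetical stand-in for the expensive PSRN search; not the project's API.
function fake_psrn(trees::Vector{Int})
    sleep(0.2)              # pretend this is the slow GPU-bound work
    return trees .* 2
end

# Run `n_iters` evolution steps while at most one PSRN job runs in the background.
function evolve_with_background_psrn(n_iters::Int)
    psrn_task = nothing                  # handle of the in-flight job, if any
    completed = Vector{Vector{Int}}()    # results collected so far
    for iter in 1:n_iters
        if psrn_task === nothing
            # Take a snapshot of the inputs and launch PSRN without waiting for it.
            snapshot = [iter, iter + 1]
            psrn_task = Threads.@spawn fake_psrn(snapshot)
        elseif istaskdone(psrn_task)
            # The job finished on its own time; collect the result and free the slot.
            push!(completed, fetch(psrn_task))
            psrn_task = nothing
        end
        sleep(0.01)         # stand-in for one evolution step of this population
    end
    # Wait for a job that is still running when the loop ends.
    if psrn_task !== nothing
        push!(completed, fetch(psrn_task))
    end
    return completed
end

results = evolve_with_background_psrn(50)
println("PSRN jobs completed: ", length(results))
```

The loop only calls `fetch` once `istaskdone` reports that the job has finished, so the evolution steps keep running while PSRN works. Whether this maps cleanly onto the existing `start_psrn_task` / `process_psrn_results!` pair is exactly what I'm unsure about.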

Running the Example

You can run the example as follows (you may need to reduce the input size by a factor of 3 to run on a 24GB GPU):

println("Use PSRN")
# N_PSRN_INPUT = 15
# N_PSRN_INPUT = 5 # TODO this can be tuned
N_PSRN_INPUT = 5 # TODO this can be tuned

export XLA_REACTANT_GPU_MEM_FRACTION=0.99
julia -t 16 --project=. example3.jl

For comparison with vanilla SymbolicRegression.jl, run:

julia -t 16 example3.jl

You'll observe that enabling PSRN significantly slows down population evolution.
