Hi @MilesCranmer,
I've switched back to the Reactant branch because I discovered that the PythonCall branch doesn't coordinate well with Julia's multithreading: it freezes when launched with multiple threads, whereas it works fine with a single thread. This happens even after applying the changes recommended in their documentation at https://juliapy.github.io/PythonCall.jl/stable/pythoncall/#jl-multi-threading.
I believe we should continue with Reactant.jl as it seems easier for users to install. Both PythonCall and Tharray.jl introduce complications for users.
Current Implementation
Repository: https://github.com/x66ccff/SymbolicRegressionGPU.jl/tree/reactant.jl-no-stuck-cpu
Performance Bottleneck
I've been analyzing the performance bottlenecks in the Reactant.jl branch and have a question about one issue:
Currently, one population in the genetic algorithm calls PSRN (SymbolicRegressionGPU.jl/src/SymbolicRegression.jl, lines 1429 to 1436 in c5bff43):

    if options.populations > 3 # TODO I don't know how to add an option to control whether to use PSRN or not, because Options is too complex for me ...
        start_psrn_task(
            psrn_manager, dominating_trees, dataset, options, N_PSRN_INPUT, n_variables
        )
        process_psrn_results!(
            psrn_manager, state.halls_of_fame[j], dataset, options
        )
    end

This approach seems to cause other populations to wait for the PSRN-calling population to finish before continuing their iterations. This significantly slows down overall performance. I've been trying to modify this, but I'm not very familiar with Julia and don't know how to fix it properly.
Maybe we need to create a dedicated thread for PSRN instead of using one of the population threads? Do you know how to modify this? I'd appreciate specific code-level assistance.
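To make the dedicated-thread idea concrete, here is a minimal sketch of one possible pattern, using only `Base.Threads` and `Channel`. It assumes the stall comes from the PSRN call running synchronously inside the population loop, which I have not verified. `slow_psrn`, `PSRNWorker`, `start_worker`, `try_submit!`, and `drain_results!` are hypothetical names for illustration, not functions from SymbolicRegressionGPU.jl or SymbolicRegression.jl, and the strings stand in for real expression trees.

```julia
# Hypothetical stand-in for the expensive PSRN call (not the real API).
function slow_psrn(trees::Vector{String})
    sleep(0.2)                               # simulates the GPU work
    return ["psrn(" * t * ")" for t in trees]
end

# A background worker fed through channels, so callers never wait on PSRN.
struct PSRNWorker
    requests::Channel{Vector{String}}
    results::Channel{Vector{String}}
    lock::ReentrantLock
    task::Task
end

function start_worker()
    requests = Channel{Vector{String}}(1)    # at most one queued request
    results = Channel{Vector{String}}(8)
    task = Threads.@spawn begin
        for trees in requests                # loop ends once `requests` is closed
            put!(results, slow_psrn(trees))
        end
    end
    return PSRNWorker(requests, results, ReentrantLock(), task)
end

# Non-blocking submit: skip this round if a request is already queued.
function try_submit!(w::PSRNWorker, trees::Vector{String})
    lock(w.lock) do
        if isready(w.requests)
            return false
        end
        put!(w.requests, trees)              # buffer is empty, so this does not block
        return true
    end
end

# Non-blocking collect: take only the results that are already finished.
function drain_results!(w::PSRNWorker)
    out = Vector{String}[]
    lock(w.lock) do
        while isready(w.results)
            push!(out, take!(w.results))
        end
    end
    return out
end

# Demo: the "population loop" keeps iterating while PSRN runs in the background.
function demo()
    w = start_worker()
    submitted = try_submit!(w, ["x1 + x2"])
    iterations = 0
    found = Vector{String}[]
    while isempty(found)
        iterations += 1                      # stands in for one evolution step
        sleep(0.01)
        append!(found, drain_results!(w))
    end
    close(w.requests)
    wait(w.task)
    return submitted, iterations, found
end

submitted, iterations, found = demo()
println("submitted=", submitted, " iterations=", iterations)
```

In the real code, the body of `slow_psrn` would be whatever `start_psrn_task` currently runs, and the population loop would call something like `drain_results!` where `process_psrn_results!` is called now. One caveat: if the real call blocks inside native code without yielding, it will still occupy one Julia thread for its duration, so this only helps when Julia is started with more than one thread (as with `julia -t 16`).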
Running the Example
You can run the example as follows (you may need to reduce the input size by a factor of 3 to run on a 24GB GPU):
SymbolicRegressionGPU.jl/src/SymbolicRegression.jl, lines 1331 to 1336 in c5bff43:

    println("Use PSRN")
    # N_PSRN_INPUT = 15
    # N_PSRN_INPUT = 5 # TODO this can be tuned
    N_PSRN_INPUT = 5 # TODO this can be tuned

export XLA_REACTANT_GPU_MEM_FRACTION=0.99
julia -t 16 --project=. example3.jl
For comparison with vanilla SymbolicRegression.jl, run:
You'll observe that using PSRN significantly slows down population evolution.