PSRN Integration Slowing Down Population Evolution #23


Hi @MilesCranmer,

I've switched back to the Reactant branch because the PythonCall branch doesn't coordinate well with Julia's multithreading: it freezes when Julia is launched with multiple threads but works fine with a single thread. This happens even after applying the changes recommended in their documentation at https://juliapy.github.io/PythonCall.jl/stable/pythoncall/#jl-multi-threading.

I believe we should continue with Reactant.jl as it seems easier for users to install. Both PythonCall and Tharray.jl introduce complications for users.

## Current Implementation

Repository: https://github.com/x66ccff/SymbolicRegressionGPU.jl/tree/reactant.jl-no-stuck-cpu

## Performance Bottleneck

I've been analyzing the performance bottlenecks in the Reactant.jl branch and have a question about one issue:

Currently, one population in the genetic algorithm is responsible for calling PSRN:
https://github.com/x66ccff/SymbolicRegressionGPU.jl/blob/c5bff438e9f231753fc310696af965280e503c2e/src/SymbolicRegression.jl#L1429-L1436

This seems to make the other populations wait for the PSRN-calling population to finish before they continue their iterations, which significantly slows down the overall search. I've been trying to change this, but I'm not very familiar with Julia and don't know how to fix it properly.

Maybe we need to create a dedicated thread for PSRN instead of using one of the population threads? Do you know how to modify this? I'd appreciate specific code-level assistance.
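
To make the dedicated-thread idea concrete, here is a rough sketch of what I have in mind. It is not code from the repository: `fake_psrn`, `PSRNWorker`, `start_worker`, `try_submit!`, `drain_results!`, and `run_demo` are all hypothetical names, and `fake_psrn` just sleeps to stand in for the GPU call. The idea is that a single background task owns PSRN, a population loop hands it work through a `Channel` only when it is idle, and finished results are picked up later without blocking the loop.

```julia
using Base.Threads: @spawn, Atomic, atomic_cas!

# Hypothetical stand-in for the expensive PSRN call (the real one runs on the GPU).
function fake_psrn(trees::Vector{String})
    sleep(0.05)
    return ["psrn(" * t * ")" for t in trees]
end

# Hypothetical container for the background worker and its queues.
struct PSRNWorker
    requests::Channel{Vector{String}}
    results::Channel{Vector{String}}
    busy::Atomic{Bool}
    task::Task
end

function start_worker()
    requests = Channel{Vector{String}}(1)
    results = Channel{Vector{String}}(32)
    busy = Atomic{Bool}(false)
    # The only task that ever calls PSRN; it stops once `requests` is closed.
    task = @spawn begin
        for trees in requests
            put!(results, fake_psrn(trees))
            busy[] = false
        end
    end
    return PSRNWorker(requests, results, busy, task)
end

# Submit work only if the worker is idle; never waits for PSRN to finish.
function try_submit!(w::PSRNWorker, trees::Vector{String})
    # atomic_cas! returns the previous value; `false` means we claimed the worker.
    if atomic_cas!(w.busy, false, true) == false
        put!(w.requests, trees)
        return true
    end
    return false
end

# Collect whatever results are ready right now, without blocking.
function drain_results!(w::PSRNWorker)
    out = Vector{String}[]
    while isready(w.results)
        push!(out, take!(w.results))
    end
    return out
end

# Simulated population loop: keeps iterating while PSRN runs in the background.
function run_demo(; n_iters=200)
    w = start_worker()
    submitted = 0
    collected = 0
    for _ in 1:n_iters
        sleep(0.001)  # stand-in for one evolution cycle
        if try_submit!(w, ["x1 + x2"])
            submitted += 1
        end
        collected += length(drain_results!(w))
    end
    close(w.requests)  # lets the worker's loop end after any pending request
    wait(w.task)
    collected += length(drain_results!(w))
    return (iters=n_iters, submitted=submitted, collected=collected)
end

stats = run_demo()
println(stats)
```

In this toy version the loop completes all of its iterations while only a handful of PSRN calls are submitted, which is the behaviour I'd like the populations to have. I'm not sure whether the real Reactant call yields to the Julia scheduler while the GPU is busy; if it doesn't, the worker would still occupy one of the `-t` threads, though the other populations should be able to keep running.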

## Running the Example

You can run the example as follows (you may need to reduce the input size by a factor of 3 to run on a 24GB GPU):
https://github.com/x66ccff/SymbolicRegressionGPU.jl/blob/c5bff438e9f231753fc310696af965280e503c2e/src/SymbolicRegression.jl#L1331-L1336

```bash
export XLA_REACTANT_GPU_MEM_FRACTION=0.99
julia -t 16 --project=. example3.jl
```

For comparison with vanilla SymbolicRegression.jl, run:

```bash
julia -t 16 example3.jl
```

You'll observe that enabling PSRN significantly slows down population evolution.

Code referenced under "Performance Bottleneck" (`src/SymbolicRegression.jl`, lines 1429-1436):

	if options.populations > 3 # TODO: I don't know how to add an option to control whether PSRN is used, since the options are too complex for me ...
	    start_psrn_task(
	        psrn_manager, dominating_trees, dataset, options, N_PSRN_INPUT, n_variables
	    )
	    process_psrn_results!(
	        psrn_manager, state.halls_of_fame[j], dataset, options
	    )
	end

Code referenced under "Running the Example" (`src/SymbolicRegression.jl`, lines 1331-1336):

	println("Use PSRN")
	# N_PSRN_INPUT = 15
	# N_PSRN_INPUT = 5 # TODO this can be tuned
	N_PSRN_INPUT = 5 # TODO this can be tuned
