
Use of java.util.Random to generate seed causes failure #9317

@MattWellie


ProcessOutput pythonProcessOutput = executor.executeScriptAndGetOutput(
        new Resource(script, GermlineCNVCaller.class),
        null,
        composePythonArguments(intervalSubsetReadCountFiles, STARTING_SEED));
if (pythonProcessOutput.getExitValue() != 0) {
    // We restart once if the inference diverged
    if (pythonProcessOutput.getExitValue() == DIVERGED_INFERENCE_EXIT_CODE) {
        final Random generator = new Random(STARTING_SEED);
        final int nextGCNVSeed = generator.nextInt();
        logger.info("The inference failed to converge and will be restarted once with a different random seed.");
        pythonProcessOutput = executor.executeScriptAndGetOutput(
                new Resource(script, GermlineCNVCaller.class),
                null,
                composePythonArguments(intervalSubsetReadCountFiles, nextGCNVSeed));
    } else {
        throw executor.getScriptException(executor.getExceptionMessageFromScriptError(pythonProcessOutput));
    }
}

According to the java.util.Random documentation, Random.nextInt() with no argument can return any 32-bit int, so roughly half of its possible return values are negative. Negative values are not acceptable arguments to the Python script this process is preparing arguments for, because NumPy requires the seed to be between 0 and 2**32 - 1:

(from GATK-SV logs)

01:26:00.580 INFO  GermlineCNVCaller - The inference failed to converge and will be restarted once with a different random seed.
...
01:26:00.586 DEBUG ScriptExecutor -   /mnt/disks/cromwell_root/tmp.MOkT0J/case_denoising_calling.6807924686880120959.py
01:26:00.586 DEBUG ScriptExecutor -   --ploidy_calls_path=/mnt/disks/cromwell_root/contig-ploidy-calls-dir
01:26:00.586 DEBUG ScriptExecutor -   --output_calls_path=/mnt/disks/cromwell_root/out/case-calls
01:26:00.586 DEBUG ScriptExecutor -   --output_tracking_path=/mnt/disks/cromwell_root/out/case-tracking
01:26:00.586 DEBUG ScriptExecutor -   --input_model_path=/mnt/disks/cromwell_root/gcnv-model
01:26:00.586 DEBUG ScriptExecutor -   --random_seed=-1623339239
01:26:00.586 DEBUG ScriptExecutor -   --read_count_tsv_files
....

Traceback (most recent call last):
  File "/mnt/disks/cromwell_root/tmp.MOkT0J/case_denoising_calling.6807924686880120959.py", line 199, in <module>
    task = gcnvkernel.CaseDenoisingCallingTask(
  File "/opt/miniconda/envs/gatk/lib/python3.10/site-packages/gcnvkernel/tasks/task_case_denoising_calling.py", line 120, in __init__
    super().__init__(hybrid_inference_params, denoising_model, copy_number_emission_sampler, copy_number_caller,
  File "/opt/miniconda/envs/gatk/lib/python3.10/site-packages/gcnvkernel/tasks/inference_task_base.py", line 279, in __init__
    self.continuous_model_advi = ADVIDeterministicAnnealing(
  File "/opt/miniconda/envs/gatk/lib/python3.10/site-packages/gcnvkernel/inference/deterministic_annealing.py", line 47, in __init__
    approx = MeanField(local_rv=local_rv,
  File "/opt/miniconda/envs/gatk/lib/python3.10/site-packages/pymc/variational/approximations.py", line 338, in __init__
    groups = [self._group_class(None, *args, **kwargs)]
  File "/opt/miniconda/envs/gatk/lib/python3.10/site-packages/pymc/variational/opvi.py", line 746, in __init__
    self.rng = np.random.RandomState(random_seed)
  File "numpy/random/mtrand.pyx", line 185, in numpy.random.mtrand.RandomState.__init__
  File "_mt19937.pyx", line 168, in numpy.random._mt19937.MT19937._legacy_seeding
  File "_mt19937.pyx", line 182, in numpy.random._mt19937.MT19937._legacy_seeding
ValueError: Seed must be between 0 and 2**32 - 1
01:26:20.754 INFO  GermlineCNVCaller - Shutting down engine
[January 29, 2026 at 1:26:20 AM GMT] org.broadinstitute.hellbender.tools.copynumber.GermlineCNVCaller done. Elapsed time: 14.50 minutes.
Runtime.totalMemory()=5968494592
org.broadinstitute.hellbender.utils.python.PythonScriptExecutorException: 
python exited with 1

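Swapping this out for Random.nextInt(int bound) would guarantee a non-negative value, since it returns a value from 0 (inclusive) up to bound (exclusive). Alternatively, the seed could be run through abs() before it is passed to the gcnvkernel module, although in Java Math.abs(Integer.MIN_VALUE) is still negative, so that one case would need separate handling.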
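
As a rough illustration of the two suggestions above, here is a minimal Java sketch. The class SeedSketch, the helpers nextNonNegativeSeed and clearSignBit, and the starting seed value used in main are hypothetical names and values chosen for this example; none of them come from the GATK code base, and this is not a proposed patch.

```java
import java.util.Random;

// Hypothetical helper class for illustration only; not part of GATK.
class SeedSketch {

    // Derive a restart seed in [0, Integer.MAX_VALUE) from the starting seed.
    // Random.nextInt(int bound) never returns a negative value.
    static int nextNonNegativeSeed(final long startingSeed) {
        final Random generator = new Random(startingSeed);
        return generator.nextInt(Integer.MAX_VALUE);
    }

    // Alternative to abs(): clear the sign bit, which also handles
    // Integer.MIN_VALUE (Math.abs(Integer.MIN_VALUE) stays negative).
    static int clearSignBit(final int value) {
        return value & Integer.MAX_VALUE;
    }

    public static void main(final String[] args) {
        final int startingSeed = 1984; // arbitrary example value (assumption)
        System.out.println("restart seed: " + nextNonNegativeSeed(startingSeed));
        System.out.println("Math.abs(Integer.MIN_VALUE) < 0: "
                + (Math.abs(Integer.MIN_VALUE) < 0));
        System.out.println("clearSignBit(-1623339239) = " + clearSignBit(-1623339239));
    }
}
```

Either helper yields a value inside the 0 to 2**32 - 1 range that the NumPy error message asks for. Because the generator is built from a fixed seed, the derived restart seed is the same on every run for a given starting seed.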
