Description
Hello,
I am attempting to perform Baysor segmentation (
The dataset is quite large, with the following specifics:
- Transcript Count: 123,749,421
- Gene Count: 5101
- Panel: 5k-gene panel
I am running this analysis on a Linux platform with
Despite the substantial resources, I am encountering extremely high memory usage, and the runtime is prohibitively long.
I am using the following general command structure and configuration parameters:
> JULIA_NUM_THREADS=80 bin/baysor run \
> -c xenium.toml \
> output/transcripts_uncompressed.csv \
> :cell_id \
> -o path/to/test_output \
> --plot
and the following TOML configuration (xenium.toml):
[data]
x = "x_location"
y = "y_location"
z = "z_location"
gene = "feature_name"
min_molecules_per_gene = 10
exclude_genes = "NegControl*,BLANK_,antisense_,DeprecatedCodeword_,UnassignedCodeword_,Intergenic_*"
min_molecules_per_cell = 50

[segmentation]
unassigned_prior_label = "UNASSIGNED"
prior_segmentation_confidence = 0.5

[plotting]
min_pixels_per_cell = 10
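One way to check whether a configuration like this behaves sensibly before committing to the full dataset is to crop the transcript table to a small spatial window and run on that first. The sketch below is only an illustration under stated assumptions: it assumes the CSV carries the `x_location` and `y_location` columns named in the config above, and the function name `crop_transcripts`, the window bounds, and the demo rows are all hypothetical. It streams rows one at a time, so the full table never has to fit in memory.

```python
import csv
import io


def crop_transcripts(src, dst, x_range, y_range,
                     x_col="x_location", y_col="y_location"):
    """Copy rows from src to dst whose coordinates fall inside the window.

    src and dst are open text file objects; x_range and y_range are
    (low, high) pairs, inclusive of low and exclusive of high.
    Returns the number of rows kept.
    """
    reader = csv.DictReader(src)
    writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
    writer.writeheader()
    kept = 0
    for row in reader:
        x, y = float(row[x_col]), float(row[y_col])
        if x_range[0] <= x < x_range[1] and y_range[0] <= y < y_range[1]:
            writer.writerow(row)
            kept += 1
    return kept


# Made-up three-row table, only to show the call; not real data.
demo = io.StringIO(
    "x_location,y_location,z_location,feature_name\n"
    "10.0,5.0,1.0,GeneA\n"
    "900.0,40.0,2.0,GeneB\n"
    "20.5,30.0,1.5,GeneC\n"
)
out = io.StringIO()
n_kept = crop_transcripts(demo, out, x_range=(0, 100), y_range=(0, 100))
print(n_kept, "rows kept")
```

With real files, `src` and `dst` would be handles opened with `open(path, newline="")`, and the cropped CSV would be passed to the same run command in place of the full table.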
Here is the log output so far:
[23:28:51] Info: Loaded 123749421 transcripts, 5101 genes.
[23:39:11] Info: Estimating noise level
[01:44:44] Info: Done
[03:04:48] Info: Clustering molecules...
Progress: 0%|▏ | ETA: 121.55 days
Iteration: 1023
Max. difference: 0.0346
Fraction of probs changed: 0.0181
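To put the timestamps above in perspective, the gaps between consecutive log lines can be worked out with a few lines of Python. This is a minimal sketch: the log text is copied from the output above, the helper name `stage_durations` is hypothetical, and it assumes that whenever the clock appears to go backwards exactly one midnight was crossed.

```python
from datetime import datetime, timedelta

LOG = """\
[23:28:51] Info: Loaded 123749421 transcripts, 5101 genes.
[23:39:11] Info: Estimating noise level
[01:44:44] Info: Done
[03:04:48] Info: Clustering molecules...
"""


def stage_durations(log_text):
    """Return (message, seconds until the next log line) pairs."""
    stamps = []
    day_offset = timedelta(0)
    prev = None
    for line in log_text.splitlines():
        clock, _, message = line.partition("] ")
        t = datetime.strptime(clock.lstrip("["), "%H:%M:%S") + day_offset
        if prev is not None and t < prev:
            # Clock went backwards: assume one midnight rollover.
            day_offset += timedelta(days=1)
            t += timedelta(days=1)
        stamps.append((t, message))
        prev = t
    return [
        (msg, int((nxt - t).total_seconds()))
        for (t, msg), (nxt, _) in zip(stamps, stamps[1:])
    ]


durations = stage_durations(LOG)
for msg, secs in durations:
    print(f"{secs / 60:7.1f} min after '{msg}'")
```

By this calculation, about 10 minutes passed between loading and the start of noise estimation, about 2 hours 6 minutes until "Done", and a further 1 hour 20 minutes before clustering began.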
My Question
I have been running the process for 12 hours, but I haven't seen any significant progress. I suspect there might be an error in my setup or parameters. Do you have any suggestions for reducing this extremely long wait time? Any feedback is greatly appreciated.