Release 0.3.0

@luigilcsilva luigilcsilva released this 13 Feb 22:03
e7d0b52
  • Fix densmap scalability for large catalogs by replacing dense per-partition aggregation with sparse histogram reduction in a bounded fan-in tree, preventing oversized gather tasks at high depths.
  • Compute only the finest densmap from source data and derive lower orders by exact NESTED parent-child aggregation (4 children -> 1 parent), reducing repeated catalog scans; keep per-depth progress logs (Computing/Derived/Wrote densmap_o*.fits).
  • Optimize score_density_hybrid stage-1 per-tile top-k with an exact two-stage strategy (local prune + global reduce), reducing shuffle volume and improving runtime on large catalogs.
  • Add score_density_hybrid.density_up_to_depth (default 4) to control how far stage-1 density selection runs before switching to score-based stage-2.
  • Update output TSV column-ordering semantics for columns.keep: when keep is omitted or null, preserve the original input order; when keep is complete, honor its explicit order; when keep is missing required columns, prepend them (RA/DEC first when absent); and when keep=[], still place RA/DEC first.
  • Make stage-2 depth writing (no Allsky) streaming-based with bucketed temporary fragments, avoiding depth_ddf.compute() materialization on the driver and reducing distributed-filesystem metadata pressure.
  • Run stage-2 bucket processing on distributed workers (Client.submit) so compute/IO stay on workers and the driver remains orchestration-only.
  • Require an active dask.distributed client for streamed stage-2 writes; fail fast when absent instead of silently degrading to local execution.
  • Auto-tune merge fan-in per worker task using RLIMIT_NOFILE and worker concurrency, and bound fan-in rounds to prevent EMFILE (Too many open files) during high-depth bucket merges.
  • Keep stage-2 k-way merge on a single bounded fan-in safety path, simplifying behavior while preserving robustness under high fragment fan-out.
  • Reuse selection-stage per-depth write stats for final output counts (telemetry/properties) and remove slow full-TSV recount fallback; pipeline now fails fast if required intermediate stats are missing/invalid.
  • Add startup observability logs for cluster runtime (local/SLURM resources + directives) and stage-2 streaming execution (worker count, bucket count, fan-in reduction summary).
  • Fix a distributed compatibility warning by reading worker concurrency from Worker.state.nthreads (with a fallback for older versions), avoiding the FutureWarning raised by newer distributed releases.
  • Remove pandas FutureWarning in local top-k pruning by avoiding partition-level DataFrameGroupBy.apply.
  • Detailed run benchmarks for these optimizations are tracked in:
    • benchmarks/records/2026-02-10_des_dr2_score_density_hybrid_topk_two_stage.md
    • benchmarks/records/2026-02-10_des_dr2_densmaps_finest_derive.md
    • benchmarks/records/2026-02-12_des_dr2_score_density_hybrid_dask_workers_fanin.md
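
The NESTED parent-child derivation of lower-order densmaps mentioned above can be illustrated with a minimal sketch. In HEALPix NESTED ordering, pixel p at order k-1 has children 4p..4p+3 at order k, so a parent index is the child index shifted right by two bits. The function name and the sparse (pixel, count) representation below are assumptions for illustration, not the project's actual code.

```python
import numpy as np

def derive_parent_counts(child_pix, child_counts):
    """Aggregate sparse NESTED counts from order k to order k-1.

    Hypothetical helper: takes the non-empty pixel indices and their
    counts at order k, returns the same at order k-1.
    """
    child_pix = np.asarray(child_pix, dtype=np.int64)
    child_counts = np.asarray(child_counts, dtype=np.int64)
    # In NESTED ordering, children 4p..4p+3 share parent p.
    parent_pix = child_pix >> 2
    uniq, inverse = np.unique(parent_pix, return_inverse=True)
    sums = np.zeros(uniq.size, dtype=np.int64)
    # Unbuffered add so repeated parent indices accumulate correctly.
    np.add.at(sums, inverse, child_counts)
    return uniq, sums

# Derive one level down from a small sparse histogram.
pix, counts = derive_parent_counts([0, 1, 2, 3, 4, 7, 16], [1, 2, 3, 4, 5, 6, 7])
```

Applying the function repeatedly walks from the finest order down to order 0 without rescanning the source catalog, and the total count is conserved at every step.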
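
The two-stage per-tile top-k strategy (local prune, then global reduce) can be sketched in plain Python as follows. This is not the pipeline's implementation: the row layout `(tile, score, obj_id)`, the function names, and the assumption that a higher score is better are all hypothetical. The result is exact because any row in a tile's global top-k must also be in the top-k of the partition that holds it.

```python
import heapq
from collections import defaultdict

def local_topk(rows, k):
    """Stage A: within one partition, keep at most k best rows per tile."""
    by_tile = defaultdict(list)
    for tile, score, obj_id in rows:
        by_tile[tile].append((score, obj_id))
    # Ties on score are broken by obj_id, giving a deterministic total order.
    return {tile: heapq.nlargest(k, items) for tile, items in by_tile.items()}

def global_topk(partials, k):
    """Stage B: merge the pruned per-partition candidates and cut to k again."""
    merged = defaultdict(list)
    for partial in partials:
        for tile, items in partial.items():
            merged[tile].extend(items)
    return {tile: heapq.nlargest(k, items) for tile, items in merged.items()}

partitions = [
    [(0, 0.9, 1), (0, 0.1, 2), (1, 0.5, 3)],
    [(0, 0.8, 4), (0, 0.7, 5), (1, 0.6, 6), (1, 0.2, 7)],
]
result = global_topk([local_topk(part, 2) for part in partitions], 2)
```

Only the pruned candidates (at most k rows per tile per partition) cross partition boundaries, which is what reduces shuffle volume in a scheme like this.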
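
As a rough illustration of the bounded fan-in idea, the sketch below derives a per-task fan-in from the soft RLIMIT_NOFILE limit and the worker's thread count, then merges sorted runs in rounds so that no single merge touches more than that many inputs. The helper names, the reserved-descriptor margin, and the floor and cap values are assumptions, and in-memory lists stand in for the on-disk fragments; the `resource` module is Unix-only.

```python
import heapq
import resource

def pick_fan_in(worker_threads, reserved_fds=64, floor=2, cap=256):
    """Hypothetical heuristic: split the open-file budget across worker threads."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY:
        return cap
    # Leave a margin for sockets, logs, and libraries, then divide per thread.
    budget = (soft - reserved_fds) // max(worker_threads, 1)
    return max(floor, min(cap, budget))

def bounded_merge(sorted_runs, fan_in):
    """Merge sorted runs in rounds, at most fan_in inputs per merge."""
    if fan_in < 2:
        raise ValueError("fan_in must be at least 2")
    runs = [list(run) for run in sorted_runs]
    if not runs:
        return []
    while len(runs) > 1:
        # Each round shrinks the number of runs by roughly a factor of fan_in.
        runs = [
            list(heapq.merge(*runs[i:i + fan_in]))
            for i in range(0, len(runs), fan_in)
        ]
    return runs[0]

fan_in = pick_fan_in(worker_threads=4)
merged = bounded_merge([[1, 4], [2, 5], [3, 6], [0, 7], [8]], fan_in=2)
```

With file-backed fragments, each round would write intermediate merged files, so a larger fan-in means fewer rounds but more simultaneously open descriptors; bounding it trades some extra I/O for protection against EMFILE.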