VS-1780 - Set sample_info.is_loaded for parquet ingest #9320
base: VS-1736
```
@@ -46,6 +46,7 @@ workflow GvsImportGenomes {
```

The workflow inputs in this region after the change (the hunk adds one line here):

```wdl
    String? billing_project_id

    # Dump these parquet files to a bucket
    Boolean use_parquet_ingest = true # currently only in limited use.
    String output_gcs_dir
    Boolean configure_parquet_bucket_lifecycle = false
```
```diff
@@ -187,12 +188,14 @@ workflow GvsImportGenomes {
   }
 
-  if (load_vet_and_ref_ranges) {
-    call SetIsLoadedColumn {
-      input:
-        load_done = LoadData.done,
-        project_id = project_id,
-        dataset_name = dataset_name,
-        cloud_sdk_docker = effective_cloud_sdk_docker,
+  if (!use_parquet_ingest) {
+    call SetIsLoadedColumn {
+      input:
+        load_done = LoadData.done,
+        project_id = project_id,
+        dataset_name = dataset_name,
+        cloud_sdk_docker = effective_cloud_sdk_docker,
+    }
+  }
 
   if (configure_parquet_bucket_lifecycle) {
```
```diff
@@ -211,7 +214,7 @@ workflow GvsImportGenomes {
     input:
       project_id = project_id,
       dataset_name = dataset_name,
-      set_is_loaded_done = SetIsLoadedColumn.done,
+      load_done = LoadData.done,
       lifecycle_configured = select_first([ConfigureParquetLifecycle.done, "done"]),
       variants_docker = effective_variants_docker,
   }
```
```diff
@@ -245,6 +248,17 @@ workflow GvsImportGenomes {
         load_outputs = LoadParquetFilesToBQ.completion_status,
         variants_docker = effective_variants_docker,
     }
+
+    if (use_parquet_ingest) {
+      # Update sample_info.is_loaded once parquet loading has been verified
+      call SetIsLoadedColumnForParquetIngest {
+        input:
+          go = VerifyParquetLoading.done,
+          project_id = project_id,
+          dataset_name = dataset_name,
+          cloud_sdk_docker = effective_cloud_sdk_docker,
+      }
+    }
   }
 
   output {
```
```diff
@@ -459,7 +473,7 @@ task LoadData {
     vet_parquet_file=`ls vet_*.parquet`
     ref_parquet_file=`ls ref_*.parquet`
 
-    # parse the table partition out of the file name
+    # parse the table superpartition out of the file name
     table_number=$(echo "$vet_parquet_file" | cut -d'_' -f2)
 
     # copy the vet and ref parquet files to the gcs bucket in the right place
```
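For context on that parse: the superpartition number is the second underscore-delimited field of the parquet file name. A quick sanity check with a hypothetical file name (the sample and numbering below are made up for illustration):

```bash
# Hypothetical parquet file name following the vet_<superpartition>_... naming scheme
vet_parquet_file="vet_001_input_vcf_42_NA12878.vcf.gz.parquet"

# Same parse as in LoadData: take the second '_'-delimited field
table_number=$(echo "$vet_parquet_file" | cut -d'_' -f2)
echo "$table_number"   # prints: 001
```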
```diff
@@ -569,6 +583,67 @@ task SetIsLoadedColumn {
   }
 }
```

The `SetIsLoadedColumnForParquetIngest` task below is added in its entirety; the review comments interleaved here attach to specific lines of its query.

```wdl
task SetIsLoadedColumnForParquetIngest {
  input {
    String dataset_name
    String project_id

    Boolean go
    String cloud_sdk_docker
  }
  meta {
    # This is doing some tricky stuff with `INFORMATION_SCHEMA` so just punt and let it be `volatile`.
    volatile: true
  }

  # add labels for DSP Cloud Cost Control Labeling and Reporting
  String bq_labels = "--label service:gvs --label team:variants --label managedby:import_genomes"

  command <<<
    # Prepend date, time and pwd to xtrace log entries.
    PS4='\D{+%F %T} \w $ '
    set -o errexit -o nounset -o pipefail -o xtrace

    echo "project_id = ~{project_id}" > ~/.bigqueryrc

    # Set is_loaded to true if there is a corresponding vet table partition with rows for that sample_id.

    # Note that we tried modifying CreateVariantIngestFiles to UPDATE sample_info.is_loaded on a
    # per-sample basis. The major issue we found is that BigQuery allows only 20 such concurrent
    # DML statements. We considered an exponential backoff, but at the number of samples being
    # loaded this would introduce significant delays in workflow processing. So this task sets
    # *all* of the sample_info.is_loaded flags at one time.

    # bq query --max_rows check: ok update
    bq --apilog=false --project_id=~{project_id} query --format=csv --use_legacy_sql=false ~{bq_labels} \
      'UPDATE `~{dataset_name}.sample_info` SET is_loaded = true
       WHERE sample_id IN (SELECT CAST(partition_id AS INT64)
```
Collaborator: Ah ok... I tinkered with this query in the console and I think I see how this works. But I'm wondering if this is still going to return correct results for vet tables > 001? Wouldn't the partitions in vet_002 start at 1 again?
Author: I actually copied the bit about the partition from the 'normal' SetIsLoadedColumn method - I had thought it had been put in there to avoid some of the weirdly named vet and ref_ranges tables that were created during foxtrot? Very possible I misunderstood that.
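One way to probe the reviewer's question directly is to list which partition ids actually exist per vet superpartition table. This is an exploratory sketch, not part of the PR; `my-project` and `my_dataset` are placeholder names:

```bash
# List non-empty partition ids for every vet_* superpartition table.
# If vet_002 reused low partition ids, the same ids would repeat across tables here.
bq --apilog=false --project_id=my-project query --format=csv --use_legacy_sql=false \
  'SELECT table_name, partition_id
   FROM `my_dataset.INFORMATION_SCHEMA.PARTITIONS`
   WHERE partition_id NOT LIKE "__%"
     AND total_logical_bytes > 0
     AND REGEXP_CONTAINS(table_name, "^vet_[0-9]+$")
   ORDER BY table_name, partition_id'
```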
The first condition of the `WHERE` clause continues:

```wdl
       FROM `~{dataset_name}.INFORMATION_SCHEMA.PARTITIONS`
       WHERE partition_id NOT LIKE "__%" AND total_logical_bytes > 0 AND REGEXP_CONTAINS(table_name, "^vet_[0-9]+$")) OR
```
Collaborator: not sure I understand why this big

Collaborator: shouldn't there be a ref_ranges version of the AND logic here?

Author: The big OR logic I added (starting here at line 622) is trying to confirm that there is a sample_name (as extracted from the file_path) in parquet_load_status for a vet parquet file creation and also one for a ref_ranges parquet file creation.
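To sanity-check the extraction the author describes, the same regex can be run against a made-up file path; the bucket and sample name below are illustrative only:

```bash
# Apply the UPDATE statement's REGEXP_EXTRACT to a hypothetical file_path.
# Expected output: NA12878
bq --apilog=false query --format=csv --use_legacy_sql=false \
  'SELECT REGEXP_EXTRACT(
     "gs://my-bucket/vet_001_input_vcf_42_NA12878.vcf.gz.parquet",
     r".*input_vcf_\d+_(.*).vcf.gz.parquet$") AS sample_name'
```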
The query concludes with the `OR` branch discussed in the thread above, followed by the task's runtime and output sections (the unchanged `GetUningestedSampleIds` task follows immediately after):

```wdl
       sample_name IN (
         SELECT sample_name FROM
         (
           SELECT REGEXP_EXTRACT(file_path, r".*input_vcf_\d+_(.*).vcf.gz.parquet$") AS sample_name,
           FROM `~{dataset_name}.parquet_load_status`
           WHERE REGEXP_CONTAINS(file_path, ".*vet_[0-9]+_input_vcf_[0-9]+_.*$")
           INTERSECT DISTINCT
           SELECT REGEXP_EXTRACT(file_path, r".*input_vcf_\d+_(.*).vcf.gz.parquet$") AS sample_name,
           FROM `~{dataset_name}.parquet_load_status`
           WHERE REGEXP_CONTAINS(file_path, ".*ref_ranges_[0-9]+_input_vcf_[0-9]+_.*$")
         )
       )'
  >>>
  runtime {
    docker: cloud_sdk_docker
    memory: "1 GB"
    disks: "local-disk 10 HDD"
    cpu: 1
  }

  output {
    Boolean done = true
  }
}
```
```diff
@@ -796,7 +871,7 @@ task CreateParquetTrackingTable {
   input {
     String project_id
     String dataset_name
-    String set_is_loaded_done
+    Array[String] load_done
     String lifecycle_configured
     String variants_docker
   }
```
```diff
@@ -955,5 +1030,6 @@ task VerifyParquetLoading {
     Int loaded_files = read_json(results_json)["loaded_files"]
     Int missing_files = read_json(results_json)["missing_files"]
     File? missing_files_list = "verification_output/missing_files.txt"
+    Boolean done = true
   }
 }
```
Corrected spelling of 'saple_info' to 'sample_info'.
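After the workflow completes, a quick way to confirm the flag was actually set is a count over `sample_info`; again, `my-project` and `my_dataset` are placeholders:

```bash
# Compare how many sample_info rows have is_loaded = true against the total row count.
bq --apilog=false --project_id=my-project query --format=csv --use_legacy_sql=false \
  'SELECT COUNTIF(is_loaded) AS loaded_samples, COUNT(*) AS total_samples
   FROM `my_dataset.sample_info`'
```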