Merged
Changes from 14 commits
16 commits
fc0a25f
Upgrade picard, gatk, fgbio, snpeff to eliminate Java fat JAR CVEs
dpark01 Mar 20, 2026
b49b4b2
Add CVSS v4.0 support to Rego policy; filter DoS-only CVEs (Section 6)
dpark01 Mar 20, 2026
f6e2998
Replace GATK with FreeBayes for variant calling; eliminate GATK4 enti…
dpark01 Mar 20, 2026
3859a7f
Merge branch 'rego-v2-cvss4' into upgrade-java-tools
dpark01 Mar 20, 2026
0feb58f
Bump bioinformatics tool version floors to match actual resolved vers…
dpark01 Mar 20, 2026
b37cbe4
Fix FreeBayes VCF output: bgzip and tabix-index when .vcf.gz requested
dpark01 Mar 20, 2026
b3708fd
Bump classify and phylo tool version floors to match resolved versions
dpark01 Mar 20, 2026
8fed3e3
Relax hard-pinned package versions to allow minor updates
dpark01 Mar 20, 2026
fee61a1
Revert samtools/bcftools floors to fix ARM64 freebayes solver conflict
dpark01 Mar 20, 2026
0e3294d
Revert classify.txt version floors to fix ARM64 blast solver conflict
dpark01 Mar 20, 2026
d5d9daa
Switch refine_assembly tests from novoalign to minimap2
dpark01 Mar 20, 2026
27c0415
Remove stale novoalign import from test_assembly.py
dpark01 Mar 20, 2026
2726315
Update expected test outputs for FreeBayes+minimap2; skip novoalign w…
dpark01 Mar 20, 2026
241092c
Fix NameError: inline SKIP_X86_ONLY_REASON for TestGap2Seq
dpark01 Mar 20, 2026
8c48956
Address PR review comments: cleanup stale GATK references and temp files
dpark01 Mar 20, 2026
cafa55b
Re-add --skipRealign as deprecated no-op for WDL backward compatibility
dpark01 Mar 20, 2026
219 changes: 187 additions & 32 deletions .trivy-ignore-policy.rego
@@ -10,8 +10,8 @@
# 2. Containers have no inbound network listeners (no ports exposed)
# 3. Containers run with dropped capabilities (no CAP_SYS_ADMIN, etc.)
# 4. Containers do not run in privileged mode
-# 5. Pipeline inputs are data files (FASTQ, BAM, VCF, reference genomes),
-# not attacker-controlled interactive input
+# 5. Pipeline inputs are data files (FASTQ, BAM, VCF, reference genomes)
+# which may be untrusted or malformed
#
# If any assumption does not hold for a given image or use case,
# DO NOT apply this policy to that image.
@@ -28,8 +28,15 @@
# accordingly. The CVSS vector string location has changed across
# Trivy versions (see https://github.com/aquasecurity/trivy/issues/1627).
#
-# VERSION: 1.0
-# LAST REVIEWED: 2026-03-19
+# CVSS VERSION SUPPORT:
+# This policy supports both CVSS v3.1 and CVSS v4.0 vector strings.
+# Trivy is transitioning to v4.0 for newer advisories. Some CVEs may
+# have only a v4.0 vector (no v3.1). The helper functions at the bottom
+# extract vectors from both versions and the rules are written to match
+# either format.
+#
+# VERSION: 2.0
+# LAST REVIEWED: 2026-03-20
# REVIEW CADENCE: Quarterly, or when platform architecture changes
#

@@ -44,20 +51,20 @@ default ignore = false
# These CVEs require hands-on hardware interaction (USB, Firewire,
# JTAG, etc.) which is impossible in any cloud PaaS context.
#
# CVSS v3.1: AV:P
# CVSS v4.0: AV:P (same field name)
#
# Risk of false negative: Essentially zero. There is no scenario
# in which a pipeline container is physically accessible to an attacker.
# Confidence: Very High
###############################################################################

ignore {
-    cvss_vector := get_v3_vector(input)
-    contains(cvss_vector, "/AV:P/")
+    has_v3_field(input, "AV:P")
}

-# Also catch AV:P at the end of the vector string (no trailing slash)
ignore {
-    cvss_vector := get_v3_vector(input)
-    endswith(cvss_vector, "/AV:P")
+    has_v4_field(input, "AV:P")
}

###############################################################################
@@ -69,19 +76,20 @@ ignore {
# on orchestrated infrastructure where the attacker cannot place
# themselves on an adjacent segment.
#
# CVSS v3.1: AV:A
# CVSS v4.0: AV:A (same field name)
#
# Risk of false negative: Very low. Cloud networking abstractions
# make adjacent-network attacks impractical against pipeline containers.
# Confidence: Very High
###############################################################################

ignore {
-    cvss_vector := get_v3_vector(input)
-    contains(cvss_vector, "/AV:A/")
+    has_v3_field(input, "AV:A")
}

ignore {
-    cvss_vector := get_v3_vector(input)
-    endswith(cvss_vector, "/AV:A")
+    has_v4_field(input, "AV:A")
}

###############################################################################
@@ -102,14 +110,32 @@ ignore {
# - AV:L + UI:R together means "must have local access AND a human
# must do something" - genuinely inapplicable in batch containers.
#
# CVSS v3.1: AV:L + UI:R
# CVSS v4.0: AV:L + UI:P (Passive) or UI:A (Active)
# v4.0 splits "Required" into Passive (viewing content) and Active
# (clicking/interacting). Both require a human, so both are safe to
# ignore in batch containers.
#
# Risk of false negative: Very low for true batch pipeline containers.
# Confidence: High
###############################################################################

# v3: AV:L + UI:R
ignore {
has_v3_field(input, "AV:L")
has_v3_field(input, "UI:R")
}

# v4: AV:L + UI:P (Passive user interaction)
ignore {
-    cvss_vector := get_v3_vector(input)
-    contains(cvss_vector, "/AV:L/")
-    contains(cvss_vector, "/UI:R")
+    has_v4_field(input, "AV:L")
+    has_v4_field(input, "UI:P")
}

# v4: AV:L + UI:A (Active user interaction)
ignore {
has_v4_field(input, "AV:L")
has_v4_field(input, "UI:A")
}
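
Review aid for Section 3: the three rules above amount to one condition, which can be sketched outside Rego. The Python below is an illustration only and not part of this PR. `has_field` and `needs_local_user` are hypothetical names, and the vectors in the usage note are hand-written samples. It assumes the same whole-component matching the policy's helpers describe.

```python
def has_field(vector: str, field: str) -> bool:
    """True if `field` is a whole slash-delimited component of `vector`."""
    return bool(vector) and (f"/{field}/" in vector or vector.endswith(f"/{field}"))


def needs_local_user(v3: str = "", v4: str = "") -> bool:
    """Section 3 sketch: local attack vector plus any form of human interaction."""
    # v3.1 has a single "Required" value for user interaction.
    v3_match = has_field(v3, "AV:L") and has_field(v3, "UI:R")
    # v4.0 splits "Required" into Passive (UI:P) and Active (UI:A).
    v4_match = has_field(v4, "AV:L") and (
        has_field(v4, "UI:P") or has_field(v4, "UI:A")
    )
    return v3_match or v4_match
```

For example, `needs_local_user(v3="CVSS:3.1/AV:L/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H")` is true, while the same vector with `AV:N` is not, which matches the note that `UI:R` alone is not ignored.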

###############################################################################
@@ -126,15 +152,22 @@ ignore {
# accessible vulnerability requiring high privileges may still be
# relevant if the service runs as a privileged user.
#
# CVSS v3.1: AV:L + PR:H
# CVSS v4.0: AV:L + PR:H (same field names)
#
# Risk of false negative: Low, assuming containers run as non-root.
# If your containers run as root, REMOVE THIS RULE.
# Confidence: High (conditional on non-root execution)
###############################################################################

ignore {
-    cvss_vector := get_v3_vector(input)
-    contains(cvss_vector, "/AV:L/")
-    contains(cvss_vector, "/PR:H/")
+    has_v3_field(input, "AV:L")
+    has_v3_field(input, "PR:H")
}

ignore {
has_v4_field(input, "AV:L")
has_v4_field(input, "PR:H")
}

###############################################################################
@@ -154,60 +187,173 @@ ignore {
# the container-host boundary (e.g., container escape via kernel exploit)
# IS dangerous and is NOT ignored by this rule.
#
# CVSS v3.1: AV:L + S:U
# CVSS v4.0: AV:L + SC:N + SI:N + SA:N
# v4.0 replaced the binary S:U/S:C with three subsequent-component
# impact fields. SC:N + SI:N + SA:N means no impact on any component
# beyond the vulnerable one — equivalent to v3's S:U.
#
# Risk of false negative: Low. The theoretical concern is that AV:L+S:U
# could include reading mounted secrets, but an attacker with code
# execution can already read those secrets directly.
# Confidence: High
###############################################################################

# v3: AV:L + S:U
ignore {
-    cvss_vector := get_v3_vector(input)
-    contains(cvss_vector, "/AV:L/")
-    contains(cvss_vector, "/S:U/")
+    has_v3_field(input, "AV:L")
+    has_v3_field(input, "S:U")
}

-# Also catch S:U at the end of the vector string (no trailing slash)
+# v4: AV:L + no subsequent-component impact
ignore {
-    cvss_vector := get_v3_vector(input)
-    contains(cvss_vector, "/AV:L/")
-    endswith(cvss_vector, "/S:U")
+    has_v4_field(input, "AV:L")
+    has_v4_field(input, "SC:N")
+    has_v4_field(input, "SI:N")
+    has_v4_field(input, "SA:N")
}
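
Review aid for Section 5: the v3 `S:U` to v4 `SC:N + SI:N + SA:N` mapping described above can be checked with a small sketch. This is not part of the PR. It parses the vector into a dictionary instead of using the policy's substring matching, which is my choice for readability, and the function names are hypothetical.

```python
def parse_vector(vector: str) -> dict:
    """Split 'CVSS:x.y/K:V/...' into {K: V}, dropping the version prefix."""
    parts = vector.split("/")[1:]
    return dict(part.split(":", 1) for part in parts if ":" in part)


def local_contained_v3(vector: str) -> bool:
    """v3.1: local attack vector and scope unchanged."""
    fields = parse_vector(vector)
    return fields.get("AV") == "L" and fields.get("S") == "U"


def local_contained_v4(vector: str) -> bool:
    """v4.0: local attack vector and no subsequent-component impact."""
    fields = parse_vector(vector)
    return fields.get("AV") == "L" and all(
        fields.get(key) == "N" for key in ("SC", "SI", "SA")
    )
```

A v4 vector with any one of `SC`, `SI`, or `SA` set to something other than `N` fails the check, which corresponds to a v3 vector with `S:C`.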

###############################################################################
-# HELPER FUNCTION: Extract the CVSS v3 vector string
+# SECTION 6: AVAILABILITY-ONLY IMPACT, SCOPE UNCHANGED
#
# Rationale: CVEs where the only impact is availability (DoS/resource
# exhaustion) and scope is unchanged mean: processing crafted input can
# crash or hang the affected process, but cannot read data (C:N),
# modify data (I:N), or affect other components (S:U).
#
# In ephemeral batch containers, a DoS means a single pipeline job
# fails or hangs until it hits its timeout or memory limit. This is
# operationally equivalent to a corrupted input file or OOM — the job
# fails, the container is destroyed, and the next job runs on a fresh
# container. There is no persistent state corruption, no data
# exfiltration, and no lateral movement.
#
# This rule applies regardless of attack vector (including AV:N),
# because the impact is strictly contained: even if triggered by
# network-delivered data, the worst outcome is one failed job.
#
# NOTE: This does NOT ignore:
# - DoS with S:C / SC≠N / SI≠N / SA≠N (scope changed — could
# affect host or other containers)
# - DoS combined with any confidentiality or integrity impact
# (C≠N or I≠N), which could indicate data leaks or corruption
# alongside the crash
#
# CVSS v3.1: C:N + I:N + S:U (with any A value)
# CVSS v4.0: VC:N + VI:N + SC:N + SI:N + SA:N (with any VA value)
#
# Risk of false negative: Low. The concern would be if a DoS could be
# weaponized into a resource exhaustion attack against the compute
# platform (e.g., repeatedly submitting jobs with crafted inputs to
# burn credits). This is a business logic concern mitigated by job
# submission controls and cost alerts, not by container hardening.
# Confidence: High
###############################################################################

# v3: C:N + I:N + S:U (availability-only, scope unchanged)
ignore {
has_v3_field(input, "C:N")
has_v3_field(input, "I:N")
has_v3_field(input, "S:U")
}

# v4: VC:N + VI:N + no subsequent-component impact (availability-only)
ignore {
has_v4_field(input, "VC:N")
has_v4_field(input, "VI:N")
has_v4_field(input, "SC:N")
has_v4_field(input, "SI:N")
has_v4_field(input, "SA:N")
}
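
Review aid for Section 6: because this rule applies regardless of attack vector, it helps to see which vectors it does and does not match. The sketch below is a Python mirror of the two rules, not part of the PR, with hypothetical names (`has_field`, `availability_only`) and the same whole-component matching assumed throughout.

```python
def has_field(vector: str, field: str) -> bool:
    """True if `field` is a whole slash-delimited component of `vector`."""
    return bool(vector) and (f"/{field}/" in vector or vector.endswith(f"/{field}"))


# Fields that must all be present for the availability-only rules to fire.
V3_DOS_ONLY = ("C:N", "I:N", "S:U")
V4_DOS_ONLY = ("VC:N", "VI:N", "SC:N", "SI:N", "SA:N")


def availability_only(v3: str = "", v4: str = "") -> bool:
    """Section 6 sketch: no confidentiality or integrity impact, scope contained."""
    v3_match = all(has_field(v3, field) for field in V3_DOS_ONLY)
    v4_match = all(has_field(v4, field) for field in V4_DOS_ONLY)
    return v3_match or v4_match
```

With the v3.1 sample vector from the helper comments (`.../S:U/C:N/I:N/A:H`) the result is true even though it is `AV:N`. With the vector quoted for CVE-2020-25649 in `.trivyignore` (`.../S:U/C:N/I:H/A:N`) the result is false because of `I:H`, which is consistent with that CVE needing its own per-CVE entry.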

###############################################################################
# HELPER FUNCTIONS: Extract and match CVSS vector strings
#
# Trivy's JSON structure nests CVSS data under input.CVSS with vendor
# keys. The vector string location varies by data source. We check
# multiple common paths and prefer NVD.
#
# CVSS v3.1 vectors look like: CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H
# CVSS v4.0 vectors look like: CVSS:4.0/AV:N/AC:L/AT:N/PR:N/UI:N/VC:N/VI:N/VA:H/SC:N/SI:N/SA:N
#
# Fields are slash-delimited key:value pairs. The has_vX_field helpers
# check for a field anywhere in the vector, handling both mid-string
# (/field/) and end-of-string (/field) positions.
#
# IMPORTANT: Run `trivy image --format json <your-image>` and inspect
# the .Vulnerabilities[].CVSS structure to confirm these paths work
-# for your Trivy version. If the structure differs, update this
-# function accordingly.
+# for your Trivy version. If the structure differs, update these
+# functions accordingly.
###############################################################################
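
Review aid for the IMPORTANT note above: one way to confirm the paths is to list which CVSS fields each vendor actually provides in a report. The sketch below is not part of the PR. The `Results[].Vulnerabilities[]` nesting is my understanding of Trivy's JSON report layout and should be checked against real output. The report is a hand-written sample, the second CVE ID is a placeholder, and `cvss_fields_by_vendor` is a hypothetical name.

```python
import json

# Hand-written sample shaped like `trivy image --format json` output (assumed layout).
SAMPLE_REPORT = json.loads("""
{
  "Results": [
    {
      "Target": "example-image (hypothetical)",
      "Vulnerabilities": [
        {
          "VulnerabilityID": "CVE-2020-25649",
          "CVSS": {
            "nvd": {"V3Vector": "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:H/A:N"}
          }
        },
        {
          "VulnerabilityID": "CVE-0000-00000",
          "CVSS": {
            "ghsa": {"V40Vector": "CVSS:4.0/AV:N/AC:L/AT:N/PR:N/UI:N/VC:N/VI:N/VA:H/SC:N/SI:N/SA:N"}
          }
        }
      ]
    }
  ]
}
""")


def cvss_fields_by_vendor(report: dict) -> dict:
    """Map each vulnerability ID to {vendor: sorted CVSS field names}."""
    found = {}
    for result in report.get("Results") or []:
        for vuln in result.get("Vulnerabilities") or []:
            cvss = vuln.get("CVSS") or {}
            found[vuln["VulnerabilityID"]] = {
                vendor: sorted(data) for vendor, data in cvss.items()
            }
    return found


for vuln_id, vendors in cvss_fields_by_vendor(SAMPLE_REPORT).items():
    print(vuln_id, vendors)
```

If a real report shows field names other than `V3Vector` and `V40Vector`, the extraction helpers would need to follow.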

# --- CVSS v3.1 vector extraction ---

get_v3_vector(vuln) = vector {
vector := vuln.CVSS.nvd.V3Vector
} else = vector {
vector := vuln.CVSS.redhat.V3Vector
} else = vector {
vector := vuln.CVSS.ghsa.V3Vector
} else = vector {
# Fallback: try any vendor that has a V3Vector
some vendor
vector := vuln.CVSS[vendor].V3Vector
} else = "" {
true
}

# --- CVSS v4.0 vector extraction ---

get_v4_vector(vuln) = vector {
vector := vuln.CVSS.nvd.V40Vector
} else = vector {
vector := vuln.CVSS.redhat.V40Vector
} else = vector {
vector := vuln.CVSS.ghsa.V40Vector
} else = vector {
some vendor
vector := vuln.CVSS[vendor].V40Vector
} else = "" {
true
}
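
Review aid for the extraction helpers: the `else` chains above express a vendor preference order (nvd, then redhat, then ghsa, then any other vendor). A Python sketch of the same lookup follows. It is not part of the PR, and `get_vector` and `PREFERRED_VENDORS` are hypothetical names. One deliberate difference: the Rego fallback binds `some vendor` with no stated order, whereas this sketch tries the remaining vendors in sorted order so the result is deterministic. That ordering is my assumption, not behavior taken from the source.

```python
PREFERRED_VENDORS = ("nvd", "redhat", "ghsa")


def get_vector(vuln: dict, key: str) -> str:
    """Return the first vector found under `key`, or "" if no vendor has one.

    `key` is "V3Vector" or "V40Vector", the field names used in the policy.
    """
    cvss = vuln.get("CVSS") or {}
    # Preferred vendors first, then any other vendor in sorted order (assumption).
    others = sorted(vendor for vendor in cvss if vendor not in PREFERRED_VENDORS)
    for vendor in list(PREFERRED_VENDORS) + others:
        vector = (cvss.get(vendor) or {}).get(key)
        if vector is not None:
            return vector
    return ""
```

The empty-string default matters downstream: the field-matching helpers treat `""` as "no vector", so a CVE with no vector of a given version never matches a rule for that version.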

# --- Field matching helpers ---
# Check if a CVSS vector contains a specific field value.
# Handles both mid-string (/AV:N/) and end-of-string (/AV:N) positions.

has_v3_field(vuln, field) {
cvss_vector := get_v3_vector(vuln)
cvss_vector != ""
contains(cvss_vector, concat("", ["/", field, "/"]))
}

has_v3_field(vuln, field) {
cvss_vector := get_v3_vector(vuln)
cvss_vector != ""
endswith(cvss_vector, concat("", ["/", field]))
}

has_v4_field(vuln, field) {
cvss_vector := get_v4_vector(vuln)
cvss_vector != ""
contains(cvss_vector, concat("", ["/", field, "/"]))
}

has_v4_field(vuln, field) {
cvss_vector := get_v4_vector(vuln)
cvss_vector != ""
endswith(cvss_vector, concat("", ["/", field]))
}
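
Review aid for the field-matching helpers: the leading slash in the pattern is what keeps short field names from matching inside longer ones (`I:N` inside `UI:N`, or `C:N` inside v4's `VC:N` and `SC:N`). The Python sketch below contrasts the slash-delimited match with a plain substring test. It is not part of the PR, and both function names are hypothetical.

```python
def has_field(vector: str, field: str) -> bool:
    """Match `field` mid-string (/field/) or at end-of-string (/field)."""
    if vector == "":
        return False
    return f"/{field}/" in vector or vector.endswith(f"/{field}")


def naive_has_field(vector: str, field: str) -> bool:
    """Plain substring test, shown only to illustrate the false positive."""
    return field in vector


# Integrity impact is High here, but "UI:N" contains the text "I:N".
V3_SAMPLE = "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H"
# v4 sample from the helper comments: "VC:N" and "SC:N" contain "C:N".
V4_SAMPLE = "CVSS:4.0/AV:N/AC:L/AT:N/PR:N/UI:N/VC:N/VI:N/VA:H/SC:N/SI:N/SA:N"
```

On `V3_SAMPLE`, the naive test reports `I:N` as present while the delimited test does not. The last field (`A:H` in v3, `SA:N` in v4) is only caught by the end-of-string branch.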

###############################################################################
# RULES INTENTIONALLY NOT INCLUDED (and why):
#
-# 1. AV:N (Network attack vector) — NOT ignored.
+# 1. AV:N (Network attack vector) — NOT blanket-ignored.
# Even though batch pipeline containers typically have no inbound
# listeners, some AV:N CVEs involve outbound connections triggered
# by processing attacker-influenced data (e.g., Log4Shell). We
-# cannot safely blanket-ignore network-vector CVEs.
+# cannot safely blanket-ignore network-vector CVEs. However,
+# Section 6 does ignore AV:N CVEs that are availability-only with
+# no scope change, since the worst outcome is a crashed job.
#
# 2. UI:R alone (without AV:L) — NOT ignored.
# Some AV:N + UI:R vulnerabilities involve scenarios like processing
@@ -233,4 +379,13 @@ get_v3_vector(vuln) = vector {
# boundaries (e.g., container escape via kernel exploit). These are
# dangerous even in ephemeral containers. Only AV:L + S:U (Scope
# Unchanged) is ignored — see Section 5 above.
#
# 6. Inbound-listener-only server CVEs — NOT categorically ignored.
# Many AV:N CVEs in fat JARs (Jetty, ZooKeeper, Netty server-side)
# require an active network listener that we never start. However,
# CVSS does not distinguish inbound-listener vs. data-processing
# attack surfaces within AV:N. Adding package-name-based exceptions
# here would be fragile and is better handled in .trivyignore with
# per-CVE justification documenting that the server component is
# never instantiated.
###############################################################################
29 changes: 29 additions & 0 deletions .trivyignore
@@ -50,3 +50,32 @@
# ADDED: 2026-03-19
# -----------------------------------------------------------------------------
CVE-2026-23949

# -----------------------------------------------------------------------------
# CVE-2020-25649 — jackson-databind XXE in DOMDeserializer
# Severity: HIGH (AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:H/A:N)
# Package: com.fasterxml.jackson.core:jackson-databind 2.10.5
# Location: opt/conda/share/snpeff-5.2-3/snpEff.jar
#
# JUSTIFICATION:
# The vulnerable code path is DOMDeserializer, which is invoked when
# jackson-databind deserializes XML input into DOM objects without
# disabling external entity resolution. This enables XXE attacks
# (SSRF, local file read) when processing attacker-controlled XML.
#
# snpEff uses jackson-databind for JSON parsing only (config metadata,
# database indices). snpEff's input formats are VCF (tab-delimited),
# GenBank (flat text), and Java properties files. It never parses XML
# through Jackson's DOMDeserializer. The vulnerable code is present in
# the fat JAR but the code path is never traversed.
#
# This justification covers only snpEff's bundled copy. Trivy matches
# .trivyignore entries by CVE ID globally, so this entry WILL also
# suppress the finding if jackson-databind 2.10.5 appears in other JARs.
# We accept this because no other JAR in our images bundles this old
# version. If that changes, re-evaluate this entry.
#
# RESOLUTION: Upstream snpEff must update its bundled jackson-databind.
# ADDED: 2026-03-20
# -----------------------------------------------------------------------------
CVE-2020-25649
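
Review aid for the entry above: the note states that Trivy matches `.trivyignore` by CVE ID globally. The sketch below models only that ID-based matching, not Trivy's implementation, to show the practical consequence. `apply_trivyignore` is a hypothetical name, and the second JAR path and the placeholder CVE ID are invented for illustration.

```python
def apply_trivyignore(findings: list, ignored_ids: set) -> list:
    """Drop every finding whose ID is listed, regardless of package or path."""
    return [f for f in findings if f["VulnerabilityID"] not in ignored_ids]


findings = [
    # The finding this entry is meant to cover.
    {"VulnerabilityID": "CVE-2020-25649",
     "PkgPath": "opt/conda/share/snpeff-5.2-3/snpEff.jar"},
    # Hypothetical second JAR bundling the same vulnerable library.
    {"VulnerabilityID": "CVE-2020-25649",
     "PkgPath": "opt/example/other-tool.jar"},
    # Hypothetical unrelated finding that must stay visible.
    {"VulnerabilityID": "CVE-0000-00000",
     "PkgPath": "opt/example/other-tool.jar"},
]

remaining = apply_trivyignore(findings, {"CVE-2020-25649"})
```

Under ID-only matching both CVE-2020-25649 findings disappear, including the one outside snpEff, which is why the entry needs re-evaluation if another JAR ever bundles the same version.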
2 changes: 1 addition & 1 deletion docker/requirements/assemble-x86.txt
@@ -4,4 +4,4 @@

# Gap2Seq: Fills gaps in genome assemblies using short reads
# ARM64 build conflicts with pillow/libtiff due to old htslib/libdeflate pins
-gap2seq=3.1.1a
+gap2seq>=3.1.1a