Commit cf0bdd9

fix(sbom,ci): pin verifier check (C) to bomsh-traced gitoid; codespell
`make sbom`'s libtool relink rewrites `src/.libs/lib*.so*` after bomsh has already gitoid-ed it, which broke check (C) on the previous push. Capture the gitoid in `make bomsh` BEFORE `make sbom`, persist `<path>\t<gitoid>` to `_bomsh.artefact`, and have check (C) compare the SPDX gitoid against the saved gitoid (emitting a NOTE if the on-disk file has diverged).

Plus two codespell typo fixes (`unparseable` → `unparsable`).

Signed-off-by: Sameeh Jubran <sameeh@wolfssl.com>
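
The capture step described in the commit message can be sketched in Python as below. This is a minimal illustration, not the recipe itself: `blob_gitoid` and `record_artefact` are hypothetical names, and the only parts taken from the commit are the git-blob SHA-1 construction (`sha1("blob <len>\0" + data)`) and the one-line tab-separated manifest format.

```python
import hashlib


def blob_gitoid(data: bytes) -> str:
    """Git-blob SHA-1 of `data`: sha1(b"blob <len>" + NUL + data)."""
    h = hashlib.sha1()
    h.update(b"blob %d\0" % len(data))
    h.update(data)
    return h.hexdigest()


def record_artefact(manifest_path: str, artefact_path: str) -> str:
    """Hypothetical helper: pin the artefact's gitoid as it exists right now.

    Writes a single line, path + TAB + gitoid, so that a later rewrite of
    the artefact (for example a relink) does not change the recorded value.
    """
    with open(artefact_path, "rb") as f:
        gid = blob_gitoid(f.read())
    with open(manifest_path, "w") as f:
        f.write(f"{artefact_path}\t{gid}\n")
    return gid
```

Calling `record_artefact` before any step that rewrites the library is what keeps the recorded gitoid tied to the traced bytes; hashing the file again afterwards may legitimately give a different value.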
1 parent 284a9b2 commit cf0bdd9

3 files changed

Lines changed: 183 additions & 67 deletions

Makefile.am

Lines changed: 33 additions & 13 deletions
@@ -475,10 +475,16 @@ BOMSH_CONF = $(abs_builddir)/_bomsh.conf
475475
BOMSH_OMNIBORDIR = $(abs_builddir)/omnibor
476476
BOMSH_SPDX_OUT = omnibor.wolfssl-$(PACKAGE_VERSION).spdx.json
477477
# Single-source-of-truth manifest of the library artefact bomtrace3
478-
# actually traced. Written by the bomsh: recipe so downstream
479-
# verification (CI: `Bomsh provenance is end-to-end verifiable`) doesn't
480-
# have to re-derive the same WOLFSSL_LIB_DSO_BASENAMES priority order
481-
# in parallel and risk drift.
478+
# actually traced. Format: one line, '<path>\t<gitoid>'. Both fields
479+
# are captured by the bomsh: recipe right after bomtrace3 finishes, so
480+
# downstream verification (CI: `Bomsh provenance is end-to-end
481+
# verifiable`) compares the SPDX gitoid against the gitoid bomsh
482+
# itself recorded -- decoupling check (C) from the file's *current*
483+
# bytes, which `make sbom`'s subsequent `make install` step relinks
484+
# in place via libtool (RPATH fixup), changing the gitoid that would
485+
# be re-computed off the on-disk file. The verifier still warns when
486+
# the on-disk gitoid disagrees, so the install-time relink remains
487+
# visible.
482488
BOMSH_ARTEFACT_MANIFEST = $(abs_builddir)/_bomsh.artefact
483489
bomshdir = $(datadir)/doc/$(PACKAGE)
484490

@@ -508,25 +514,39 @@ bomsh:
508514
@printf 'raw_logfile=%s\n' '$(BOMSH_RAWLOG_BASE)' > '$(BOMSH_CONF)'
509515
$(BOMTRACE3) -c '$(BOMSH_CONF)' $(MAKE)
510516
$(BOMSH_CREATE_BOM) -r '$(BOMSH_RAWLOG)' -b '$(BOMSH_OMNIBORDIR)'
511-
$(MAKE) sbom
512-
@if test -z "$(BOMSH_SBOM)"; then \
513-
echo "NOTE: bomsh_sbom.py not in PATH; skipping SPDX enrichment."; \
514-
echo " The OmniBOR graph in $(BOMSH_OMNIBORDIR) is still produced."; \
515-
exit 0; \
516-
fi; \
517-
bomsh_artifact=""; \
517+
@# Capture the gitoid of the bomtrace3-traced library BEFORE the
518+
@# `make sbom` below, which calls `make install DESTDIR=...` --
519+
@# libtool's --mode=install relinks src/.libs/lib*.so* in place
520+
@# to fix RPATH, mutating the bytes that bomsh recorded in the
521+
@# ADG via bomsh_create_bom above. Capturing here pins the
522+
@# verifier's check (C) to the bomsh-traced gitoid (so SPDX <->
523+
@# manifest agree even though the on-disk bytes diverge after
524+
@# install). The on-disk divergence is surfaced as a verifier
525+
@# warning, not a failure.
526+
@bomsh_artifact=""; \
518527
for lib in \
519528
$(addprefix "$(abs_builddir)/src/.libs"/,$(WOLFSSL_LIB_DSO_BASENAMES)) \
520529
"$(abs_builddir)/src/.libs/libwolfssl.a" \
521530
"$(abs_builddir)/src/libwolfssl.a"; do \
522531
if test -f "$$lib"; then bomsh_artifact="$$lib"; break; fi; \
523532
done; \
524-
if test -z "$$bomsh_artifact"; then \
533+
if test -n "$$bomsh_artifact"; then \
534+
bomsh_artifact_gid=`$(PYTHON3) -c 'import hashlib,sys;d=open(sys.argv[1],"rb").read();h=hashlib.sha1();h.update(("blob %d\0"%len(d)).encode());h.update(d);print(h.hexdigest())' "$$bomsh_artifact"`; \
535+
printf '%s\t%s\n' "$$bomsh_artifact" "$$bomsh_artifact_gid" \
536+
> '$(BOMSH_ARTEFACT_MANIFEST)'; \
537+
fi
538+
$(MAKE) sbom
539+
@if test -z "$(BOMSH_SBOM)"; then \
540+
echo "NOTE: bomsh_sbom.py not in PATH; skipping SPDX enrichment."; \
541+
echo " The OmniBOR graph in $(BOMSH_OMNIBORDIR) is still produced."; \
542+
exit 0; \
543+
fi; \
544+
if test ! -f '$(BOMSH_ARTEFACT_MANIFEST)'; then \
525545
echo "NOTE: no built libwolfssl artifact found in $(abs_builddir)/src/.libs/"; \
526546
echo " OmniBOR graph produced; SPDX enrichment skipped."; \
527547
exit 0; \
528548
fi; \
529-
printf '%s\n' "$$bomsh_artifact" > '$(BOMSH_ARTEFACT_MANIFEST)'; \
549+
bomsh_artifact=`awk 'NR==1 {print $$1}' '$(BOMSH_ARTEFACT_MANIFEST)'`; \
530550
echo "Enriching SPDX with OmniBOR ExternalRefs (artifact: $$bomsh_artifact)..."; \
531551
$(BOMSH_SBOM) \
532552
-b '$(BOMSH_OMNIBORDIR)' \

scripts/bomsh_verify.py

Lines changed: 86 additions & 42 deletions
@@ -15,17 +15,26 @@
1515
a downstream verifier weeks later.
1616
1717
(C) Artefact correspondence -- the gitoid recorded against the
18-
wolfSSL package equals the git-blob hash of the actual library
19-
artefact that `make bomsh` traced (read from the
20-
`_bomsh.artefact` manifest written by the bomsh: Makefile
21-
target). This is what makes the SBOM a true attestation of
22-
the binary that would ship.
18+
wolfSSL package equals the gitoid bomsh itself recorded for the
19+
library it traced (read from the `_bomsh.artefact` manifest the
20+
bomsh: Makefile target writes as '<path>\\t<gitoid>' BEFORE
21+
`make sbom` runs). This is the strongest claim the bomsh
22+
pipeline alone can make: the SPDX agrees with what bomsh saw.
23+
24+
Comparing against bomsh's own recorded gitoid (rather than
25+
against the on-disk file's *current* bytes) is deliberate.
26+
`make sbom`'s subsequent `make install` step relinks
27+
src/.libs/lib*.so* in place via libtool to fix RPATH, mutating
28+
the bytes after bomsh has already gitoid-ed them. The verifier
29+
still hashes the on-disk file and emits a NOTE if it has
30+
diverged, so the install-time relink remains visible without
31+
causing a false negative on the bomsh<->SPDX agreement.
2332
2433
Without this, a future `bomsh_sbom.py` change that emits a
2534
plausibly-shaped but fictional gitoid (one that does not resolve in
26-
the ADG, or resolves but to the wrong artefact) would pass the
27-
existing PERSISTENT-ID assertion and ship a provenance bundle whose
28-
externalRef is a lie.
35+
the ADG, or resolves but to a different artefact than bomsh recorded)
36+
would pass the existing PERSISTENT-ID assertion and ship a provenance
37+
bundle whose externalRef is a lie.
2938
3039
CLI form (used by `.github/workflows/sbom.yml`):
3140
@@ -150,27 +159,51 @@ def check_object_store_integrity(omnibor_objects_dir):
150159
return obj_count, bad
151160

152161

153-
def check_artefact_correspondence(spdx_gitoids, artefact_path,
154-
package_name_substr='wolfssl'):
155-
"""(C) The gitoid recorded against the wolfSSL package equals the
156-
git-blob hash of the library artefact at <artefact_path>.
157-
158-
Returns (artefact_gid, wolfssl_gids). Caller checks
159-
`artefact_gid in wolfssl_gids`. Raises FileNotFoundError if the
160-
artefact does not exist; raises ValueError if no SPDX gitoid is
161-
associated with a wolfSSL package."""
162-
if not os.path.isfile(artefact_path):
162+
def parse_artefact_manifest(manifest_path):
163+
"""Parse the `_bomsh.artefact` manifest written by the bomsh:
164+
recipe. Format: a single line, `<absolute-path>\\t<gitoid-hex>`
165+
-- both fields captured by the recipe AFTER bomtrace3 finishes
166+
but BEFORE `make sbom` relinks the library.
167+
168+
Returns (path, recorded_gid). Raises FileNotFoundError if the
169+
manifest does not exist (bomsh: skipped artefact discovery, e.g.
170+
no built library); raises ValueError if the line is malformed."""
171+
if not os.path.isfile(manifest_path):
163172
raise FileNotFoundError(
164-
f'artefact {artefact_path!r} does not exist')
165-
artefact_gid = gitoid_sha1(artefact_path)
173+
f'{manifest_path} not produced by `make bomsh`; cannot '
174+
f'verify gitoid <-> artefact correspondence. This usually '
175+
f'means the bomsh enrichment step skipped the artefact-'
176+
f'discovery loop (no built library).')
177+
with open(manifest_path) as f:
178+
line = f.readline().rstrip('\n')
179+
if not line:
180+
raise ValueError(
181+
f'{manifest_path} is empty; bomsh: recipe wrote nothing')
182+
parts = line.split('\t')
183+
if len(parts) != 2 or not all(parts):
184+
raise ValueError(
185+
f'{manifest_path}: expected "<path>\\t<gitoid>", got {line!r}. '
186+
f'Re-run `make bomsh` against an up-to-date Makefile.am.')
187+
return parts[0], parts[1]
188+
189+
190+
def check_artefact_correspondence(spdx_gitoids, recorded_gid,
191+
package_name_substr='wolfssl'):
192+
"""(C) The gitoid bomsh recorded for the traced library matches a
193+
gitoid externalRef on the wolfSSL SPDX package. This is the
194+
bomsh<->SPDX agreement check; it does NOT compare against the
195+
on-disk file's current bytes (see module docstring).
196+
197+
Returns (matched, wolfssl_gids). Raises ValueError if no SPDX
198+
gitoid is associated with a wolfSSL-named package."""
166199
wolfssl_gids = [gid for name, gid in spdx_gitoids
167200
if package_name_substr in name.lower()]
168201
if not wolfssl_gids:
169202
raise ValueError(
170203
f'no SPDX gitoid externalRef on a package whose name '
171204
f'contains {package_name_substr!r}; cannot verify '
172205
f'artefact correspondence')
173-
return artefact_gid, wolfssl_gids
206+
return recorded_gid in wolfssl_gids, wolfssl_gids
174207

175208

176209
def verify(spdx_glob, omnibor_dir, artefact_manifest,
@@ -214,39 +247,50 @@ def verify(spdx_glob, omnibor_dir, artefact_manifest,
214247
f'round-trip (object store is corrupt)')
215248
return False, messages
216249

217-
if not os.path.isfile(artefact_manifest):
218-
messages.append(
219-
f'{artefact_manifest} not produced by `make bomsh`; '
220-
f'cannot verify gitoid <-> artefact correspondence. '
221-
f'This usually means the bomsh enrichment step skipped '
222-
f'the artefact-discovery loop (no built library).')
223-
return False, messages
224-
with open(artefact_manifest) as f:
225-
artefact = f.read().strip()
226-
if not artefact:
227-
messages.append(
228-
f'{artefact_manifest} is empty; bomsh: recipe wrote a '
229-
f'blank path')
250+
try:
251+
artefact, recorded_gid = parse_artefact_manifest(artefact_manifest)
252+
except (FileNotFoundError, ValueError) as e:
253+
messages.append(str(e))
230254
return False, messages
231255

232256
try:
233-
artefact_gid, wolfssl_gids = check_artefact_correspondence(
234-
spdx_gitoids, artefact, package_name_substr)
235-
except (FileNotFoundError, ValueError) as e:
257+
matched, wolfssl_gids = check_artefact_correspondence(
258+
spdx_gitoids, recorded_gid, package_name_substr)
259+
except ValueError as e:
236260
messages.append(str(e))
237261
return False, messages
238262

239-
if artefact_gid not in wolfssl_gids:
263+
if not matched:
240264
messages.append(
241265
f'wolfSSL package SPDX gitoids {wolfssl_gids} do not '
242-
f'include the gitoid of the actual built artefact '
243-
f'{artefact} ({artefact_gid}); the SBOM does not '
244-
f'attest to the binary that would ship')
266+
f'include the gitoid bomsh recorded for the traced '
267+
f'artefact {artefact} ({recorded_gid}); the SBOM is '
268+
f'inconsistent with what bomsh actually saw')
245269
return False, messages
246270

247271
messages.append(f'OK: {len(spdx_gitoids)} gitoid(s) verified')
248272
messages.append(f' objects round-trip: {obj_count} blobs')
249-
messages.append(f' artefact match: {artefact} -> {artefact_gid}')
273+
messages.append(
274+
f' artefact match: {artefact} -> {recorded_gid} (bomsh-traced)')
275+
276+
# Diagnostic-only: the on-disk file may have been rewritten since
277+
# bomsh saw it (the canonical case is `make sbom`'s `make install`
278+
# step relinking via libtool to fix RPATH). We do NOT fail on
279+
# this -- the SBOM<->bomsh agreement above is what matters for
280+
# the provenance proof -- but surfacing it as a NOTE keeps the
281+
# divergence visible so it does not silently grow into a
282+
# bigger gap (e.g. someone adds a strip step that goes unflagged).
283+
if os.path.isfile(artefact):
284+
on_disk = gitoid_sha1(artefact)
285+
if on_disk != recorded_gid:
286+
messages.append(
287+
f'NOTE: on-disk {artefact} now has gitoid {on_disk}, '
288+
f'but bomsh recorded {recorded_gid}. This is expected '
289+
f'when `make sbom` runs `make install` (libtool relinks '
290+
f'src/.libs/lib*.so* in place to fix RPATH). The SBOM '
291+
f'attests to the bomsh-traced bytes; if you need it to '
292+
f'attest to the *installed* bytes, the bomsh: recipe '
293+
f'must trace `make install` too.')
250294
return True, messages
251295

252296

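The verifier change above distinguishes three outcomes for check (C): the recorded gitoid is absent from the SPDX (fail), it is present and the on-disk file still matches (pass), or it is present but the on-disk file has since changed (pass with a NOTE). A simplified sketch of that decision follows; `parse_manifest_line` and `classify_check_c` are hypothetical names used for illustration and are not functions in `scripts/bomsh_verify.py`.

```python
from typing import List, Optional, Tuple


def parse_manifest_line(line: str) -> Tuple[str, str]:
    """Split a manifest line of the form path + TAB + gitoid.

    Raises ValueError for anything else, including the older
    path-only format, so stale manifests are rejected explicitly.
    """
    parts = line.rstrip("\n").split("\t")
    if len(parts) != 2 or not all(parts):
        raise ValueError(f"expected '<path>\\t<gitoid>', got {line!r}")
    return parts[0], parts[1]


def classify_check_c(recorded_gid: str,
                     spdx_gids: List[str],
                     on_disk_gid: Optional[str]) -> str:
    """Return 'FAIL', 'PASS', or 'PASS+NOTE' (labels are illustrative).

    The pass/fail decision depends only on the recorded gitoid and the
    SPDX gitoids; the on-disk gitoid is diagnostic and never fails the
    check. Pass None when the file no longer exists.
    """
    if recorded_gid not in spdx_gids:
        return "FAIL"
    if on_disk_gid is not None and on_disk_gid != recorded_gid:
        return "PASS+NOTE"
    return "PASS"
```

Keeping the on-disk comparison out of the pass/fail path is the design choice the commit makes: the relink stays visible in the output without turning an expected rewrite into a CI failure.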
scripts/test_gen_sbom.py

Lines changed: 64 additions & 12 deletions
@@ -400,7 +400,7 @@ def test_no_gpl_mention_returns_none_with_warning(self):
400400
'Permission is hereby granted, free of charge, ...\n')
401401
self.assertIsNone(result)
402402
# Warning must mention the file path so an operator running
403-
# `make sbom` can see which file was unparseable.
403+
# `make sbom` can see which file was unparsable.
404404
self.assertIn('no GPL version found', stderr.getvalue())
405405

406406
def test_missing_file_returns_none_with_warning(self):
@@ -1898,15 +1898,26 @@ def __init__(self, tmpdir):
18981898
# check (B) actually exercises its loop).
18991899
self.artefact_content = b'\x7fELF...wolfssl shared library content...'
19001900
self.artefact_path.write_bytes(self.artefact_content)
1901-
self.manifest_path.write_text(str(self.artefact_path) + '\n')
19021901
self.aux_blobs = [b'/* aes.c */\n', b'/* sha.c */\n']
19031902
self.gitoids = {
19041903
'wolfssl': self._stage_blob(self.artefact_content),
19051904
}
19061905
for i, content in enumerate(self.aux_blobs):
19071906
self.gitoids[f'aux{i}'] = self._stage_blob(content)
1907+
# Manifest mirrors the new bomsh: recipe format: a single line,
1908+
# '<path>\t<gitoid>'. The gitoid is captured BEFORE `make sbom`
1909+
# would rewrite the file (libtool relink), so it pins the
1910+
# bomsh-traced bytes rather than the on-disk current bytes --
1911+
# decoupling check (C) from libtool's install-time rewrite.
1912+
self.write_manifest(self.artefact_path, self.gitoids['wolfssl'])
19081913
self._write_spdx()
19091914

1915+
def write_manifest(self, path, gid):
1916+
"""Helper so individual tests can rewrite the manifest with a
1917+
deliberately-wrong path or gitoid without re-reading
1918+
the recipe's exact format."""
1919+
self.manifest_path.write_text(f'{path}\t{gid}\n')
1920+
19101921
def _stage_blob(self, content):
19111922
"""Write `content` into omnibor/objects/<aa>/<rest> at the
19121923
correct gitoid path; return the gitoid hex. Uses
@@ -2034,26 +2045,67 @@ def test_artefact_manifest_missing_fails_check_C(self):
20342045
self.assertIn('not produced by `make bomsh`', joined)
20352046

20362047
def test_artefact_gid_mismatch_fails_check_C(self):
2037-
# Manifest points at a different file (e.g. the recipe ran the
2038-
# discovery loop on a stale build). The verifier must reject
2039-
# and surface BOTH the SPDX gitoid set and the actual artefact
2040-
# gitoid so the operator can diff them.
2048+
# Manifest records a gitoid that does NOT match any wolfSSL
2049+
# SPDX externalRef. This is the canonical "bomsh recorded X
2050+
# but bomsh_sbom enriched the SPDX with a different gitoid Y"
2051+
# bug -- exactly what check (C) is here to catch. The failure
2052+
# message must surface both the SPDX gitoid set AND the
2053+
# bomsh-recorded gitoid so the operator can diff them.
2054+
with tempfile.TemporaryDirectory() as tmpdir:
2055+
fx = _BomshFixture(tmpdir)
2056+
fake_gid = 'f' * 40
2057+
fx.write_manifest(fx.artefact_path, fake_gid)
2058+
ok, messages = fx.verify()
2059+
self.assertFalse(ok)
2060+
joined = '\n'.join(messages)
2061+
self.assertIn('inconsistent with what bomsh actually saw',
2062+
joined)
2063+
self.assertIn(fake_gid, joined)
2064+
self.assertIn(fx.gitoids['wolfssl'], joined)
2065+
2066+
def test_on_disk_divergence_emits_note_but_passes(self):
2067+
# The on-disk artefact bytes have changed since bomsh recorded
2068+
# them (the canonical libtool-relink case). Check (C) compares
2069+
# against the manifest's bomsh-recorded gitoid (which still
2070+
# matches the SPDX), so the verifier must PASS, but it must
2071+
# also emit a NOTE so the divergence is not silently hidden.
2072+
# Pinning this is the contract that makes the install-time
2073+
# relink visible without breaking CI.
2074+
with tempfile.TemporaryDirectory() as tmpdir:
2075+
fx = _BomshFixture(tmpdir)
2076+
# Rewrite the on-disk artefact AFTER the fixture pinned
2077+
# its gitoid in the manifest -- simulates `make sbom`'s
2078+
# `make install` relink.
2079+
fx.artefact_path.write_bytes(b'post-relink RPATH-fixed bytes')
2080+
ok, messages = fx.verify()
2081+
self.assertTrue(ok, f'verifier failed despite agreement: {messages}')
2082+
joined = '\n'.join(messages)
2083+
self.assertIn('NOTE:', joined)
2084+
self.assertIn('libtool relinks', joined)
2085+
# Both gitoids surfaced so triage doesn't need a second pass.
2086+
self.assertIn(fx.gitoids['wolfssl'], joined)
2087+
2088+
def test_manifest_path_only_legacy_format_rejected(self):
2089+
# A manifest containing only the path (the pre-fix legacy
2090+
# format) must be rejected explicitly, with a message that
2091+
# tells the operator to re-run `make bomsh` against an
2092+
# up-to-date Makefile.am. Silent acceptance would re-introduce
2093+
# the false-positive failure mode the new format was designed
2094+
# to prevent.
20412095
with tempfile.TemporaryDirectory() as tmpdir:
20422096
fx = _BomshFixture(tmpdir)
2043-
wrong = pathlib.Path(tmpdir) / 'wrong-artefact.so'
2044-
wrong.write_bytes(b'totally different bytes')
2045-
fx.manifest_path.write_text(str(wrong) + '\n')
2097+
fx.manifest_path.write_text(str(fx.artefact_path) + '\n')
20462098
ok, messages = fx.verify()
20472099
self.assertFalse(ok)
20482100
joined = '\n'.join(messages)
2049-
self.assertIn('does not attest to the binary', joined)
2050-
self.assertIn('wolfssl', joined.lower())
2101+
self.assertIn('expected "<path>\\t<gitoid>"', joined)
2102+
self.assertIn('up-to-date Makefile.am', joined)
20512103

20522104
def test_unexpected_gitoid_locator_format_rejected(self):
20532105
# bomsh upstream switching from sha1 to sha256 would change
20542106
# the locator prefix. load_spdx_gitoids must raise so the
20552107
# maintainer is forced to update the verifier in lockstep,
2056-
# rather than silently accepting an unparseable value.
2108+
# rather than silently accepting an unparsable value.
20572109
with tempfile.TemporaryDirectory() as tmpdir:
20582110
fx = _BomshFixture(tmpdir)
20592111
spdx = json.loads(fx.spdx_path.read_text())
