Shell script hardening #95

Merged
cmdcolin merged 11 commits into main from shell-script-hardening
Apr 13, 2026

Conversation

@cmdcolin
Collaborator

Add set -euo pipefail to more scripts
Fix some synteny tracks that incorrectly had "over.chain.gz" as the assembly name, caused by parsing filenames wrong
Restore hash-based file listings. We tried file size because it is faster, but it is better to be safe than sorry

cmdcolin and others added 11 commits April 12, 2026 15:21
- Fix pushState → replaceState in useSearchFilter (search was adding history entries per keystroke)
- Add res.ok checks in useSearchIndex and useTaxonomyFilter fetchers
- Fix non-null assertion on nullable filter param in useCategoryFilter
- Remove dead typeof window checks in useCategoryFilter, useColumnVisibility, useTableSort
- Consolidate duplicate statusOrder keys to lowercase, normalize lookup with .toLowerCase()
- Fix XSS in recently-updated renderTable: replace innerHTML string interpolation with DOM APIs
- Sync HubEntry interface in recently-updated script block (add missing createdTimestamp)
- Replace as any cast in blog/index.astro with proper AstroComponentFactory typing

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add set -euo pipefail to all scripts that were missing it
- Add set -o pipefail inside exported GFF-processing functions so
  pipelines fail properly in parallel subshells
- Fix bare $REPROCESS/$REDOWNLOAD/$REPROCESS_TRIX refs (need ${VAR:-}
  under set -u); fix ${1} -> ${1:-} in cleanupStaleGff.sh
- Fix /tmp/datasets_err hardcoded path in genark2jbrowse/make.sh
  (use mktemp instead)
- Fix ! $DRY_RUN && log patterns in cleanupStaleGff.sh that would
  trip set -e when DRY_RUN=true
- Remove unnecessary || true from textIndexGoldenPath.sh parallel call
  (process_assembly already handles failures gracefully)
- Switch .trackdb_hash and fileListing.txt from stat file-size to
  xxhsum (requires: sudo apt install xxhash); note existing .trackdb_hash
  files will all be treated as changed on first run
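
The variable, DRY_RUN, and temp-file fixes in the list above can be illustrated with a minimal bash sketch. It is not the code from this PR: the function names `reprocess_requested`, `log_risky`, and `log_safe` and the variable `err_file` are hypothetical, and it assumes the DRY_RUN problem arises when the `&&` list is the last command of a function.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Under `set -u`, expanding an unset variable is an error. The ${VAR:-}
# form expands to an empty string instead, so optional flags stay optional.
reprocess_requested() {
  [[ -n ${REPROCESS:-} ]]
}

DRY_RUN=${DRY_RUN:-false}

# Risky: with DRY_RUN=true the `&&` list ends with status 1. As the last
# command of a function, that becomes the function's return value, and a
# caller running under `set -e` exits.
log_risky() {
  ! $DRY_RUN && echo "deleting $1"
}

# Safer: an `if` whose condition is false finishes with status 0.
log_safe() {
  if [[ $DRY_RUN != true ]]; then
    echo "deleting $1"
  fi
}

# A unique scratch file instead of a fixed /tmp name, removed on exit.
err_file=$(mktemp)
trap 'rm -f "$err_file"' EXIT
```

Calling `log_safe` as a plain statement is safe under `set -e` whatever DRY_RUN holds, whereas `log_risky` would need a guard such as `log_risky x || true` at every call site.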

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
text-index failures are expected (track IDs can be missing from the
index), so the parallel run should never abort the parent pipeline.
Add --halt never to also be explicit about the intent.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
wget exit code 7 (protocol error) when an assembly has no liftOver
directory was propagating through the pipeline after makePifs.sh got
set -euo pipefail. Fixes:

- extract_file_urls: capture wget output via $() so a failure returns
  empty string (exit 0) rather than crashing the pipeline
- "no chain files found" is now a soft skip (log_info + return 0)
  rather than log_error — many assemblies legitimately have no chains
- makePifs.sh: add --halt never + || true to both parallel calls as a
  backstop for any other per-assembly failures
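
A rough sketch of the three fixes above, under stated assumptions: `FETCH`, `extract_file_urls_sketch`, `process_assembly_sketch`, and `run_all` are hypothetical names, the URL is a placeholder, and the HTML parsing is a guess at what a directory listing looks like. It is not the PR's implementation. The `||` fallback on the assignment is written out because a plain `var=$(cmd)` assignment passes a failing exit status on to `set -e`.

```shell
#!/usr/bin/env bash
set -euo pipefail

# FETCH is a hypothetical hook (default: wget to stdout) so the sketch can be
# exercised without network access.
FETCH=${FETCH:-wget -q -O -}

# Print the .over.chain.gz hrefs found at a directory URL, or nothing at all
# if the fetch fails (for example because the directory does not exist).
extract_file_urls_sketch() {
  local url=$1 html
  html=$($FETCH "$url" 2>/dev/null) || html=""
  # grep exits 1 when nothing matches; `|| true` keeps that from failing
  # the pipeline under pipefail.
  grep -o 'href="[^"]*\.over\.chain\.gz"' <<<"$html" | cut -d'"' -f2 || true
}

# Soft skip: an assembly without chain files is logged and treated as success.
process_assembly_sketch() {
  local asm=$1 urls
  urls=$(extract_file_urls_sketch "https://example.invalid/$asm/liftOver/")
  if [[ -z $urls ]]; then
    echo "INFO: no chain files found for $asm, skipping"
    return 0
  fi
  echo "would build PIFs for $asm from: $urls"
}
export -f extract_file_urls_sketch process_assembly_sketch
export FETCH

# Backstop: --halt never lets the remaining jobs keep running, and `|| true`
# stops a non-zero exit from GNU parallel from aborting the parent script.
run_all() {
  parallel --halt never process_assembly_sketch ::: "$@" || true
}
```

`run_all hg19 hg38` would then process every assembly even if some of them fail; it requires GNU parallel to be installed.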

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Use a .checked stamp file in each liftOver/vs dir so subsequent runs
skip the HTTP directory fetch entirely. Pre-filter assembly lists in
make.sh/makePifs.sh before parallel to avoid forking processes for
already-stamped assemblies.
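
The stamp-file idea might look roughly like the following. The directory layout (`<root>/<assembly>/liftOver/vs/.checked`) and the helper names `stamp_path`, `filter_unstamped`, and `mark_checked` are assumptions made for illustration; only the `.checked` file name and the `liftOver/vs` directory come from the commit message.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical layout: one directory per assembly under a root, with the
# stamp stored at <root>/<assembly>/liftOver/vs/.checked.
stamp_path() {
  printf '%s/%s/liftOver/vs/.checked\n' "$1" "$2"
}

# Print only the assemblies that have not been stamped yet, one per line.
filter_unstamped() {
  local root=$1 asm
  shift
  for asm in "$@"; do
    if [[ ! -e $(stamp_path "$root" "$asm") ]]; then
      printf '%s\n' "$asm"
    fi
  done
}

# Record that an assembly's liftOver directory has been checked.
mark_checked() {
  local stamp
  stamp=$(stamp_path "$1" "$2")
  mkdir -p "$(dirname "$stamp")"
  touch "$stamp"
}
```

Piping the output of `filter_unstamped` into `parallel` means no process is forked for an assembly that already carries a stamp.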

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add make_file_listing() to common.sh: uses find -newer to skip
  unchanged files, merges with existing listing, drops deleted entries
- Switch from default XXH64 to XXH3 (-H3): 62 GB/s vs 19 GB/s on this
  machine; algo_tag header triggers full re-hash on algorithm change
- Simplify getFileListing.sh and ucsc2jbrowse/make.sh to single-line
  calls; remove duplicated inline logic
- Delete stale fileListing.txt files (XXH64 format) so next run
  rebuilds cleanly with XXH3
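
An incremental listing along these lines could be sketched as below. This is not the `make_file_listing()` from common.sh: the function name `make_file_listing_sketch`, the `HASH_CMD` override, the exact tag line, and the tab-separated "hash, path" format are all assumptions, and the listing file is assumed to live outside the directory being scanned. It needs bash 4 or later (associative arrays) and GNU `sort -z`.

```shell
#!/usr/bin/env bash
set -euo pipefail

# HASH_CMD is a hypothetical override so the sketch also runs where xxhash
# is not installed (for example HASH_CMD=cksum).
HASH_CMD=${HASH_CMD:-xxhsum -H3}
ALGO_TAG="# algo: $HASH_CMD"

# First field of the hasher's output is taken as the hash.
hash_file() {
  $HASH_CMD "$1" | awk '{print $1}'
}

# make_file_listing_sketch DIR LISTING
# LISTING holds the tag line followed by "hash<TAB>path" lines.
make_file_listing_sketch() {
  local dir=$1 listing=$2 tmp hash path
  local -A old=() changed=()
  tmp=$(mktemp)
  printf '%s\n' "$ALGO_TAG" >"$tmp"
  # Reuse the old listing only if it was written with the same algorithm.
  if [[ -f $listing && $(head -n 1 "$listing") == "$ALGO_TAG" ]]; then
    while IFS=$'\t' read -r hash path; do
      old[$path]=$hash
    done < <(tail -n +2 "$listing")
    # Files modified after the listing was written must be re-hashed.
    while IFS= read -r -d '' path; do
      changed[$path]=1
    done < <(find "$dir" -type f -newer "$listing" -print0)
  fi
  # Walk the files that exist now, so deleted entries drop out naturally.
  while IFS= read -r -d '' path; do
    if [[ -n ${old[$path]:-} && -z ${changed[$path]:-} ]]; then
      hash=${old[$path]}
    else
      hash=$(hash_file "$path")
    fi
    printf '%s\t%s\n' "$hash" "$path" >>"$tmp"
  done < <(find "$dir" -type f -print0 | sort -z)
  mv "$tmp" "$listing"
}
```

Because the tag line records the hash command, switching algorithms makes the old listing unusable and every file is hashed again on the next run.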

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ggregateTextSearchAdapters

- ~1700 liftOver synteny tracks with queryAssembly "over.chain.gz" removed;
  these were generated erroneously when assemblies had no liftOver directory
  (fixed by createChainTrackPifs patch in prior commit)
- aggregateTextSearchAdapters moved before plugins in minimal configs
  (cosmetic reordering, no content change)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…essions before createMinimalConfigs

createMinimalConfigs reads config.json and passes all top-level keys
through, but defaultSession wasn't present yet because generateDefaultSessions
ran after it.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@cmdcolin cmdcolin merged commit b3c5e17 into main Apr 13, 2026
1 check failed
@cmdcolin cmdcolin deleted the shell-script-hardening branch April 13, 2026 21:08
cmdcolin added a commit that referenced this pull request Apr 14, 2026
Consolidate type checking into root tsconfig

- Add glob as explicit dependency (was transitive only)
- Remove website/tsconfig.json; root tsconfig covers it via the
  website/.astro/types.d.ts include which provides astro/client types
  (import.meta.glob etc.)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Add vite/client reference to fix CI typecheck

website/.astro/types.d.ts is Astro-generated and absent in CI,
so ImportMeta.glob was untyped. Adding the reference to global.d.ts
ensures vite types are always loaded.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Taxonomy filters

Extract shared generate script and remove redundant buildonly

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Updates

Couple more filters

Updates

Updates

Updates

Shell script hardening (#95)

* Website code quality fixes

- Fix pushState → replaceState in useSearchFilter (search was adding history entries per keystroke)
- Add res.ok checks in useSearchIndex and useTaxonomyFilter fetchers
- Fix non-null assertion on nullable filter param in useCategoryFilter
- Remove dead typeof window checks in useCategoryFilter, useColumnVisibility, useTableSort
- Consolidate duplicate statusOrder keys to lowercase, normalize lookup with .toLowerCase()
- Fix XSS in recently-updated renderTable: replace innerHTML string interpolation with DOM APIs
- Sync HubEntry interface in recently-updated script block (add missing createdTimestamp)
- Replace as any cast in blog/index.astro with proper AstroComponentFactory typing

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Harden shell scripts: set -euo pipefail, xxhash, safe var refs

- Add set -euo pipefail to all scripts that were missing it
- Add set -o pipefail inside exported GFF-processing functions so
  pipelines fail properly in parallel subshells
- Fix bare $REPROCESS/$REDOWNLOAD/$REPROCESS_TRIX refs (need ${VAR:-}
  under set -u); fix ${1} -> ${1:-} in cleanupStaleGff.sh
- Fix /tmp/datasets_err hardcoded path in genark2jbrowse/make.sh
  (use mktemp instead)
- Fix ! $DRY_RUN && log patterns in cleanupStaleGff.sh that would
  trip set -e when DRY_RUN=true
- Remove unnecessary || true from textIndexGoldenPath.sh parallel call
  (process_assembly already handles failures gracefully)
- Switch .trackdb_hash and fileListing.txt from stat file-size to
  xxhsum (requires: sudo apt install xxhash); note existing .trackdb_hash
  files will all be treated as changed on first run

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Restore || true for textIndexGoldenPath parallel call

text-index failures are expected (track IDs can be missing from the
index), so the parallel run should never abort the parent pipeline.
Add --halt never to also be explicit about the intent.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Fix createChainTrackPifs failing on assemblies with no liftOver chains

wget exit code 7 (protocol error) when an assembly has no liftOver
directory was propagating through the pipeline after makePifs.sh got
set -euo pipefail. Fixes:

- extract_file_urls: capture wget output via $() so a failure returns
  empty string (exit 0) rather than crashing the pipeline
- "no chain files found" is now a soft skip (log_info + return 0)
  rather than log_error — many assemblies legitimately have no chains
- makePifs.sh: add --halt never + || true to both parallel calls as a
  backstop for any other per-assembly failures

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Skip PIF generation for already-processed assemblies

Use a .checked stamp file in each liftOver/vs dir so subsequent runs
skip the HTTP directory fetch entirely. Pre-filter assembly lists in
make.sh/makePifs.sh before parallel to avoid forking processes for
already-stamped assemblies.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Format

* Incremental file listing with XXH3: ~3x faster hashing

- Add make_file_listing() to common.sh: uses find -newer to skip
  unchanged files, merges with existing listing, drops deleted entries
- Switch from default XXH64 to XXH3 (-H3): 62 GB/s vs 19 GB/s on this
  machine; algo_tag header triggers full re-hash on algorithm change
- Simplify getFileListing.sh and ucsc2jbrowse/make.sh to single-line
  calls; remove duplicated inline logic
- Delete stale fileListing.txt files (XXH64 format) so next run
  rebuilds cleanly with XXH3

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Dry run output: remove bogus over.chain.gz liftOver tracks, reorder aggregateTextSearchAdapters

- ~1700 liftOver synteny tracks with queryAssembly "over.chain.gz" removed;
  these were generated erroneously when assemblies had no liftOver directory
  (fixed by createChainTrackPifs patch in prior commit)
- aggregateTextSearchAdapters moved before plugins in minimal configs
  (cosmetic reordering, no content change)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Update

* Fix defaultSession missing from minimal configs: run generateDefaultSessions before createMinimalConfigs

createMinimalConfigs reads config.json and passes all top-level keys
through, but defaultSession wasn't present yet because generateDefaultSessions
ran after it.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Add set -euo pipefail to remaining ucsc2jbrowse scripts

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

Move files around

Add shfmt formatting and fix parallel pipeline failures

- Format all shell scripts with shfmt v3.13.1 (2-space indent)
- Add shfmt to format and lint:sh scripts in package.json
- Install shfmt v3.13.1 in CI lint workflow
- Add || true to parallel calls for per-assembly operations so individual
  failures (GFF processing, add-track, text-index, metadata) don't abort
  the whole pipeline
- Fix addNcbiGffAndTextIndex.sh to handle add-track failures with a warning
  and return, matching the inline version in make.sh

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Run astro sync before typecheck in CI to generate astro:content types

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Migrate to Astro 6 Content Layer API and add astro check to CI

- Move content config to src/content.config.ts with glob loader
- Replace deprecated post.render() with render(post) from astro:content
- Import z from astro/zod instead of deprecated astro:content export
- Add astro check step to CI to type-check .astro files

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Lints

Actionsup

Bump deps

Updates