
Commit 360634d

Updates

Consolidate type checking into root tsconfig

- Add glob as explicit dependency (was transitive only)
- Remove website/tsconfig.json; root tsconfig covers it via the website/.astro/types.d.ts include, which provides astro/client types (import.meta.glob etc.)

Add vite/client reference to fix CI typecheck

website/.astro/types.d.ts is Astro-generated and absent in CI, so ImportMeta.glob was untyped. Adding the reference to global.d.ts ensures vite types are always loaded.

Taxonomy filters

Extract shared generate script and remove redundant buildonly

Couple more filters

Shell script hardening (#95)

* Website code quality fixes

  - Fix pushState → replaceState in useSearchFilter (search was adding history entries per keystroke)
  - Add res.ok checks in useSearchIndex and useTaxonomyFilter fetchers
  - Fix non-null assertion on nullable filter param in useCategoryFilter
  - Remove dead typeof window checks in useCategoryFilter, useColumnVisibility, useTableSort
  - Consolidate duplicate statusOrder keys to lowercase, normalize lookup with .toLowerCase()
  - Fix XSS in recently-updated renderTable: replace innerHTML string interpolation with DOM APIs
  - Sync HubEntry interface in recently-updated script block (add missing createdTimestamp)
  - Replace "as any" cast in blog/index.astro with proper AstroComponentFactory typing

* Harden shell scripts: set -euo pipefail, xxhash, safe var refs

  - Add set -euo pipefail to all scripts that were missing it
  - Add set -o pipefail inside exported GFF-processing functions so pipelines fail properly in parallel subshells
  - Fix bare $REPROCESS/$REDOWNLOAD/$REPROCESS_TRIX refs (need ${VAR:-} under set -u); fix ${1} -> ${1:-} in cleanupStaleGff.sh
  - Fix /tmp/datasets_err hardcoded path in genark2jbrowse/make.sh (use mktemp instead)
  - Fix "! $DRY_RUN && log" patterns in cleanupStaleGff.sh that would trip set -e when DRY_RUN=true
  - Remove unnecessary || true from textIndexGoldenPath.sh parallel call (process_assembly already handles failures gracefully)
  - Switch .trackdb_hash and fileListing.txt from stat file-size to xxhsum (requires: sudo apt install xxhash); existing .trackdb_hash files will all be treated as changed on first run

* Restore || true for textIndexGoldenPath parallel call

  text-index failures are expected (track IDs can be missing from the index), so the parallel run should never abort the parent pipeline. Add --halt never to be explicit about the intent.

* Fix createChainTrackPifs failing on assemblies with no liftOver chains

  wget exit code 7 (protocol error) when an assembly has no liftOver directory was propagating through the pipeline after makePifs.sh got set -euo pipefail. Fixes (see the sketch after this message):

  - extract_file_urls: capture wget output via $() so a failure returns an empty string (exit 0) rather than crashing the pipeline
  - "no chain files found" is now a soft skip (log_info + return 0) rather than log_error — many assemblies legitimately have no chains
  - makePifs.sh: add --halt never + || true to both parallel calls as a backstop for any other per-assembly failures

* Skip PIF generation for already-processed assemblies

  Use a .checked stamp file in each liftOver/vs dir so subsequent runs skip the HTTP directory fetch entirely. Pre-filter assembly lists in make.sh/makePifs.sh before parallel to avoid forking processes for already-stamped assemblies (sketched after this message).

* Format

* Incremental file listing with XXH3: ~3x faster hashing

  - Add make_file_listing() to common.sh: uses find -newer to skip unchanged files, merges with the existing listing, drops deleted entries
  - Switch from default XXH64 to XXH3 (-H3): 62 GB/s vs 19 GB/s on this machine; the algo_tag header triggers a full re-hash on algorithm change
  - Simplify getFileListing.sh and ucsc2jbrowse/make.sh to single-line calls; remove duplicated inline logic
  - Delete stale fileListing.txt files (XXH64 format) so the next run rebuilds cleanly with XXH3

* Dry run output: remove bogus over.chain.gz liftOver tracks, reorder aggregateTextSearchAdapters

  - ~1700 liftOver synteny tracks with queryAssembly "over.chain.gz" removed; these were generated erroneously when assemblies had no liftOver directory (fixed by the createChainTrackPifs patch in a prior commit)
  - aggregateTextSearchAdapters moved before plugins in minimal configs (cosmetic reordering, no content change)

* Fix defaultSession missing from minimal configs: run generateDefaultSessions before createMinimalConfigs

  createMinimalConfigs reads config.json and passes all top-level keys through, but defaultSession wasn't present yet because generateDefaultSessions ran after it.

* Add set -euo pipefail to remaining ucsc2jbrowse scripts

Move files around

Add shfmt formatting and fix parallel pipeline failures

- Format all shell scripts with shfmt v3.13.1 (2-space indent)
- Add shfmt to the format and lint:sh scripts in package.json
- Install shfmt v3.13.1 in the CI lint workflow
- Add || true to parallel calls for per-assembly operations so individual failures (GFF processing, add-track, text-index, metadata) don't abort the whole pipeline
- Fix addNcbiGffAndTextIndex.sh to handle add-track failures with a warning and return, matching the inline version in make.sh

Run astro sync before typecheck in CI to generate astro:content types

Migrate to Astro 6 Content Layer API and add astro check to CI

- Move content config to src/content.config.ts with the glob loader
- Replace deprecated post.render() with render(post) from astro:content
- Import z from astro/zod instead of the deprecated astro:content export
- Add an astro check step to CI to type-check .astro files

Lints

Update GitHub Actions

Bump deps

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
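The extract_file_urls fix relies on a standard set -e idiom: run the flaky command inside $() with a || true fallback, so the caller inspects its (possibly empty) output instead of letting the non-zero wget exit kill the pipeline. A minimal sketch of that pattern, assuming a hypothetical liftOver listing URL; the names here are illustrative, not the actual script:

#!/bin/bash
set -euo pipefail

extract_file_urls() {
  local url="$1" listing
  # $() + || true: a wget failure (e.g. exit 7 when the liftOver directory
  # does not exist) yields an empty string instead of aborting under set -e
  listing=$(wget -qO- "$url" 2>/dev/null || true)
  # grep exits 1 on no match, so it needs || true as well under pipefail
  echo "$listing" | grep -o 'href="[^"]*over\.chain\.gz"' || true
}

urls=$(extract_file_urls "https://example.org/GCA_000000000.1/liftOver/")
if [ -z "$urls" ]; then
  echo "no chain files found, skipping" # soft skip: exit 0, not an error
  exit 0
fi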
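The .checked stamp file is plain marker-based memoization: touch an empty file once an assembly has been processed successfully, and pre-filter the work list against the stamps before parallel ever forks a process. A sketch of the idea under an assumed liftOver/<accession>/ layout; function and path names are hypothetical:

#!/bin/bash
set -euo pipefail

process_assembly() {
  local acc="$1"
  # ... HTTP directory fetch, PIF generation ...
  touch "liftOver/$acc/.checked" # stamp only after success
}
export -f process_assembly

# Pre-filter: stamped assemblies never reach parallel, so no subshell
# is forked for work that is already done
for dir in liftOver/*/; do
  [ -f "$dir/.checked" ] || basename "$dir"
done | parallel --halt never -j8 process_assembly || true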
1 parent 756d39f commit 360634d

490 files changed

Lines changed: 245917 additions & 278401 deletions

.github/workflows/lint.yml

Lines changed: 16 additions & 5 deletions
@@ -10,24 +10,35 @@ jobs:
   lint:
     runs-on: ubuntu-latest
     steps:
-      - uses: actions/checkout@v4
+      - uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
 
-      - uses: pnpm/action-setup@v2
+      - uses: pnpm/action-setup@fc06bc1257f339d1d5d8b3a19a8cae5388b55320 # v5.0.0
         with:
-          version: 8
+          version: 10
 
-      - uses: actions/setup-node@v4
+      - uses: actions/setup-node@53b83947a5a98c8d113130e565377fae1a50d02f # v6.3.0
         with:
-          node-version: '20'
+          node-version: '24'
           cache: 'pnpm'
 
       - run: pnpm install
 
+      - name: Astro sync
+        run: cd website && pnpm astro sync
+
+      - name: Astro check
+        run: cd website && pnpm astro check
+
       - name: TypeScript type check
         run: pnpm typecheck
 
       - name: ESLint
         run: pnpm lint
 
+      - name: Install shfmt
+        run: |
+          curl -sSL "https://github.com/mvdan/sh/releases/download/v3.13.1/shfmt_v3.13.1_linux_amd64" -o /usr/local/bin/shfmt
+          chmod +x /usr/local/bin/shfmt
+
       - name: ShellCheck
         run: pnpm lint:sh

.prettierrc.json

Lines changed: 1 addition & 0 deletions
@@ -9,6 +9,7 @@
     }
   ],
   "semi": false,
+  "singleAttributePerLine": true,
   "singleQuote": true,
   "trailingComma": "all",
   "arrowParens": "avoid",

CLAUDE.md

Lines changed: 24 additions & 0 deletions
@@ -0,0 +1,24 @@
+# jb2hubs
+
+Monorepo that converts UCSC GenArk and UCSC browser hubs into JBrowse 2 configs,
+and serves them via a static website.
+
+## Packages
+
+- `website/` — Astro + React static site (pages: search, recently-updated,
+  accession, taxonomy, hubs, synteny, etc.)
+- `genark2jbrowse/` — scripts + TS that process GenArk hubs into JBrowse configs
+- `ucsc2jbrowse/` — scripts + TS that convert UCSC track hubs into JBrowse
+  configs
+- `hubtools/` — shared TS library used by the converter packages
+
+## Key website internals
+
+- `src/components/SearchPage.tsx` — client-side search over
+  `public/searchIndex.json`
+- `src/pages/recently-updated.astro` — server-rendered table with category
+  dropdown filter
+- `src/hooks/useSearchIndex.ts` — SWR fetch of the search index;
+  `IndexEntry = [accession, commonName, scientificName, assemblyName, assemblyStatus, source, taxonId]`
+- `src/recentlyUpdated.json` — build-time generated data for recently-updated
+  page

aws/config-merger/src/index.ts

Lines changed: 6 additions & 3 deletions
@@ -1,7 +1,10 @@
-import { APIGatewayProxyEventV2, APIGatewayProxyResultV2 } from 'aws-lambda'
-
 import { mergeConfigs } from './merger.ts'
-import { JBrowseConfig, MergeOptions } from './types.ts'
+
+import type { JBrowseConfig, MergeOptions } from './types.ts'
+import type {
+  APIGatewayProxyEventV2,
+  APIGatewayProxyResultV2,
+} from 'aws-lambda'
 
 function addRelativeUris(config: unknown, baseUri: string) {
   if (typeof config === 'object' && config !== null) {

aws/config-merger/src/merger.ts

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
-import {
+import type {
   AggregateTextSearchAdapter,
   Assembly,
   JBrowseConfig,

aws/config-merger/test/merger.test.ts

Lines changed: 2 additions & 1 deletion
@@ -2,7 +2,8 @@ import { describe, expect, it } from 'vitest'
 
 import { addRelativeUris, idToConfigUrl } from '../src/index.ts'
 import { mergeConfigs } from '../src/merger.ts'
-import { Assembly, JBrowseConfig, SyntenyTrack } from '../src/types.ts'
+
+import type { Assembly, JBrowseConfig, SyntenyTrack } from '../src/types.ts'
 
 function makeAssembly(name: string, displayName?: string): Assembly {
   return {

common.sh

Lines changed: 42 additions & 0 deletions
@@ -31,3 +31,45 @@ ensure_dir() {
   mkdir -p "$1"
 }
 export -f ensure_dir
+
+# Incrementally updates a hash listing file using XXH3.
+# Only re-hashes files newer than the listing; handles additions and deletions.
+# Usage: make_file_listing <listing> <find_dir> [extra_find_args...]
+make_file_listing() {
+  local listing="$1" find_dir="$2"
+  shift 2
+  local extra_args=("$@")
+  local algo="-H3"
+  local algo_tag="# algo=xxh3"
+  local tmp_new tmp_cur
+  tmp_new=$(mktemp)
+  tmp_cur=$(mktemp)
+
+  if [[ ! -f "$listing" ]] || ! head -1 "$listing" | grep -qF "$algo_tag"; then
+    find "$find_dir" -type f "${extra_args[@]}" -exec xxhsum "$algo" {} + | sort -k2,2 >"${listing}.tmp"
+    {
+      echo "$algo_tag"
+      cat "${listing}.tmp"
+    } >"$listing"
+    rm -f "${listing}.tmp" "$tmp_new" "$tmp_cur"
+    return 0
+  fi
+
+  find "$find_dir" -type f "${extra_args[@]}" -newer "$listing" -exec xxhsum "$algo" {} + >"$tmp_new"
+  find "$find_dir" -type f "${extra_args[@]}" | sort >"$tmp_cur"
+
+  tail -n +2 "$listing" |
+    awk 'NR==FNR{skip[$2]=1; next} !($2 in skip)' "$tmp_new" - |
+    awk 'NR==FNR{exists[$1]=1; next} ($2 in exists)' "$tmp_cur" - |
+    {
+      cat
+      cat "$tmp_new"
+    } |
+    sort -k2,2 >"${listing}.tmp"
+  {
+    echo "$algo_tag"
+    cat "${listing}.tmp"
+  } >"$listing"
+  rm -f "${listing}.tmp" "$tmp_new" "$tmp_cur"
+}
+export -f make_file_listing
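With this helper in place, callers reduce to a single line. For example (the paths and find filter below are assumed for illustration), the first run hashes everything and writes the "# algo=xxh3" header; later runs re-hash only files newer than the listing and drop entries for deleted files:

#!/bin/bash
set -euo pipefail
source common.sh

# incremental XXH3 listing of all .txt files under hubs/
make_file_listing fileListing.txt hubs -name '*.txt'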

eslint.config.mjs

Lines changed: 1 addition & 1 deletion
@@ -19,7 +19,7 @@ export default tseslint.config(
   languageOptions: {
     parserOptions: {
       projectService: {
-        allowDefaultProject: ['*.js'],
+        allowDefaultProject: ['*.js', '*.mjs', 'website/*.mjs'],
         defaultProject: './tsconfig.json',
       },
       tsconfigRootDir: import.meta.dirname,

genark2jbrowse/addNcbiGffAndTextIndex.sh

Lines changed: 8 additions & 3 deletions
@@ -1,5 +1,7 @@
 #!/bin/bash
 
+set -euo pipefail
+
 source "$(dirname "$0")/common.sh"
 
 # Define function to add a GFF track to a JBrowse 2 assembly and create a text index.
@@ -26,9 +28,12 @@ add_track_and_text_index() {
     return 0
   fi
 
-  jbrowse add-track --force "$gff_file_path" --out "$hub_dir" --load copy --indexFile "${gff_file_path}".csi --trackId "${accession}-ncbiGff" --name "NCBI RefSeq - RefSeq All (GFF)" --category "Genes and Gene Predictions" >/dev/null
+  if ! jbrowse add-track --force "$gff_file_path" --out "$hub_dir" --load copy --indexFile "${gff_file_path}".csi --trackId "${accession}-ncbiGff" --name "NCBI RefSeq - RefSeq All (GFF)" --category "Genes and Gene Predictions" >/dev/null; then
+    echo "Warning: add-track failed for $accession" >&2
+    return
+  fi
   # Check if trix folder exists
-  if [ -d "$hub_dir/trix" ] && [ -z "$REDOWNLOAD" ] && [ -z "$REPROCESS" ] && [ -z "$REPROCESS_TRIX" ]; then
+  if [ -d "$hub_dir/trix" ] && [ -z "${REDOWNLOAD:-}" ] && [ -z "${REPROCESS:-}" ] && [ -z "${REPROCESS_TRIX:-}" ]; then
     add_trix_adapter "$accession" "$config_file"
   else
     echo "Trix folder does not exist for $accession, running jbrowse text-index"
@@ -40,4 +45,4 @@ add_track_and_text_index() {
 # Export function for use with GNU Parallel
 export -f add_track_and_text_index
 
-find bgz -name "*.gz" | parallel -j16 $PARALLEL_OPTS add_track_and_text_index
+find bgz -name "*.gz" | parallel -j16 $PARALLEL_OPTS add_track_and_text_index || true
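The trailing || true is load-bearing: GNU parallel exits non-zero when any job fails (the exit code is the number of failed jobs, up to 100), which under set -euo pipefail would abort the whole script over one bad assembly. Note too that parallel runs each exported function in a fresh shell that does not inherit the parent's set flags, which is why hardening also has to live inside the function. A toy demonstration of both behaviors (the flaky function is hypothetical):

#!/bin/bash
set -euo pipefail

flaky() {
  # runs in a shell spawned by parallel: the parent's set -euo pipefail
  # does NOT apply here unless re-enabled inside the function
  set -o pipefail
  [ "$1" != "bad" ] # fails for the "bad" input
}
export -f flaky

# parallel exits 1 (one failed job); || true keeps set -e from aborting
printf '%s\n' good bad good | parallel --halt never -j2 flaky || true
echo "pipeline continues despite the failed job"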

genark2jbrowse/cleanupStaleGff.sh

Lines changed: 16 additions & 8 deletions
@@ -11,18 +11,20 @@
 # ./cleanupStaleGff.sh --exec   # actually delete the files
 #
 
+set -euo pipefail
+
 SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
 ALL_JSON="$SCRIPT_DIR/processedHubJson/all.json"
 LOG_FILE="$SCRIPT_DIR/CLEANED.md"
 DRY_RUN=true
 
-if [ "${1}" = "--exec" ]; then
+if [ "${1:-}" = "--exec" ]; then
   DRY_RUN=false
 fi
 
 log() {
   if ! $DRY_RUN; then
-    echo "$1" >> "$LOG_FILE"
+    echo "$1" >>"$LOG_FILE"
   fi
 }
 
@@ -41,7 +43,7 @@ delete_file() {
 
 # Build a temp file of known GFF basenames for fast lookup
 known_basenames_file=$(mktemp)
-jq -r '.[].ncbiGff | select(. != null) | split("/") | last' "$ALL_JSON" > "$known_basenames_file"
+jq -r '.[].ncbiGff | select(. != null) | split("/") | last' "$ALL_JSON" >"$known_basenames_file"
 
 cleanup() { rm -f "$known_basenames_file"; }
 trap cleanup EXIT
@@ -51,13 +53,13 @@ is_known() {
 }
 
 if ! $DRY_RUN; then
-  echo "# Cleanup log ($(date -u '+%Y-%m-%d %H:%M UTC'))" > "$LOG_FILE"
-  echo "" >> "$LOG_FILE"
+  echo "# Cleanup log ($(date -u '+%Y-%m-%d %H:%M UTC'))" >"$LOG_FILE"
+  echo "" >>"$LOG_FILE"
 fi
 
 # Remove leftover uncompressed .gff files in bgz/
 echo "=== Leftover uncompressed .gff files in bgz/ ==="
-! $DRY_RUN && log "## Leftover uncompressed .gff files in bgz/"
+if ! $DRY_RUN; then log "## Leftover uncompressed .gff files in bgz/"; fi
 for f in "$SCRIPT_DIR/bgz/"*.gff; do
   [ -f "$f" ] || continue
   delete_file "$f"
@@ -66,7 +68,10 @@ done
 # Remove .csi files in bgz/ with no corresponding .gz
 echo ""
 echo "=== Orphaned .csi files in bgz/ ==="
-! $DRY_RUN && log "" && log "## Orphaned .csi files in bgz/"
+if ! $DRY_RUN; then
+  log ""
+  log "## Orphaned .csi files in bgz/"
+fi
 for f in "$SCRIPT_DIR/bgz/"GC[FA]_*.gz.csi; do
   [ -f "$f" ] || continue
   if [ ! -f "${f%.csi}" ]; then
@@ -79,7 +84,10 @@ done
 for dir in gff bgz; do
   echo ""
   echo "=== GFF files not in listing ($dir/) ==="
-  ! $DRY_RUN && log "" && log "## GFF files not in listing ($dir/)"
+  if ! $DRY_RUN; then
+    log ""
+    log "## GFF files not in listing ($dir/)"
+  fi
 
   for f in "$SCRIPT_DIR/$dir/"GC[FA]_*.gz; do
     [ -f "$f" ] || continue
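The &&-to-if rewrites above deserve a note: when DRY_RUN=true, the "! $DRY_RUN && log ..." list short-circuits with status 1, and that status leaks out as the return value of whatever encloses the list; under set -e that is enough to abort the script at the enclosing call. The if form exits 0 whether or not its branch runs. A minimal reproduction of the hazard, using a function to make the failure visible:

#!/bin/bash
set -e

DRY_RUN=true
log() { echo "log: $1"; }

section_header() {
  # when DRY_RUN=true the && list short-circuits with status 1,
  # which becomes this function's return status...
  ! $DRY_RUN && log "## section"
}

section_header # ...so set -e aborts the script right here
echo "never reached when DRY_RUN=true"

# the fix: an if statement has status 0 even when its branch is skipped
# if ! $DRY_RUN; then log "## section"; fi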
