18 changes: 9 additions & 9 deletions .github/workflows/srsilo-ci.yml
@@ -12,7 +12,7 @@ env:

jobs:
rust-checks:
-name: srSILO Rust Tools - Quality Checks
+name: srSILO Updater - Quality Checks
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
@@ -26,35 +26,35 @@ jobs:
uses: actions/cache@v4
with:
path: ~/.cargo/registry
-key: ${{ runner.os }}-cargo-registry-${{ hashFiles('roles/srsilo/files/tools/**/Cargo.lock') }}
+key: ${{ runner.os }}-cargo-registry-${{ hashFiles('srsilo-updater/**/Cargo.lock') }}

- name: Cache cargo index
uses: actions/cache@v4
with:
path: ~/.cargo/git
-key: ${{ runner.os }}-cargo-index-${{ hashFiles('roles/srsilo/files/tools/**/Cargo.lock') }}
+key: ${{ runner.os }}-cargo-index-${{ hashFiles('srsilo-updater/**/Cargo.lock') }}

- name: Cache cargo build
uses: actions/cache@v4
with:
-path: roles/srsilo/files/tools/target
-key: ${{ runner.os }}-cargo-build-target-${{ hashFiles('roles/srsilo/files/tools/**/Cargo.lock') }}
+path: srsilo-updater/target
+key: ${{ runner.os }}-cargo-build-target-${{ hashFiles('srsilo-updater/**/Cargo.lock') }}

- name: Check formatting
run: cargo fmt --all -- --check
-working-directory: roles/srsilo/files/tools
+working-directory: srsilo-updater

- name: Run clippy
run: cargo clippy --workspace --all-targets -- -D warnings
-working-directory: roles/srsilo/files/tools
+working-directory: srsilo-updater

- name: Build
run: cargo build --release --workspace
-working-directory: roles/srsilo/files/tools
+working-directory: srsilo-updater

- name: Test
run: cargo test --workspace
-working-directory: roles/srsilo/files/tools
+working-directory: srsilo-updater

# srsilo-integration:
# name: srSILO Pipeline - Integration Test
21 changes: 19 additions & 2 deletions docs/architecture/srsilo-pipeline.md
@@ -13,6 +13,7 @@ Automated multi-virus genomic data processing: monitors LAPIS API for new sequen

## Directory Structure

**Deployment:**
```
/opt/srsilo/
├── covid/ # SARS-CoV-2 instance
@@ -24,10 +25,26 @@ Automated multi-virus genomic data processing: monitors LAPIS API for new sequen
│ ├── .last_update # Timestamp checkpoint
│ └── sorted.ndjson.zst # Merged data
├── rsva/ # RSV-A instance (same structure)
-└── tools/ # Shared Rust binaries
+└── tools/ # Deployed Rust binaries (from srsilo-updater)
└── target/release/
```

**Repository:**
```
WisePulse/
├── srsilo-updater/ # Rust workspace - data pipeline tools
│ ├── src/
│ │ ├── add_offset/
│ │ ├── check_new_data/
│ │ ├── fetch_silo_data/
│ │ ├── merge_sorted_chunks/
│ │ └── split_into_sorted_chunks/
│ ├── Cargo.toml
│ └── Cargo.lock
├── roles/srsilo/ # Ansible role
└── playbooks/srsilo/ # Ansible playbooks
```

## Components

**Playbooks:**
@@ -37,7 +54,7 @@ Automated multi-virus genomic data processing: monitors LAPIS API for new sequen
- `setup.yml` - Initial setup
- `setup-timer.yml` - Configure systemd timer

-**Rust Tools:** `check_new_data`, `fetch_silo_data`, `split_into_sorted_chunks`, `merge_sorted_chunks`, `add_offset`
+**Rust Tools (srsilo-updater):** `check_new_data`, `fetch_silo_data`, `split_into_sorted_chunks`, `merge_sorted_chunks`, `add_offset`

**Docker:** SILO (genspectrum/lapis-silo), LAPIS API (genspectrum/lapis)

4 changes: 2 additions & 2 deletions roles/srsilo/tasks/prerequisites.yml
@@ -68,9 +68,9 @@
mode: '0750'
become: yes

-- name: Copy tools directory from repository to base path
+- name: Copy srsilo-updater from repository to deployment path
copy:
-src: "{{ wisepulse_repo_path }}/roles/srsilo/files/tools/"
+src: "{{ wisepulse_repo_path }}/srsilo-updater/"
dest: "{{ srsilo_tools_path }}/"
remote_src: yes
become: yes
File renamed without changes.
File renamed without changes.
File renamed without changes.
47 changes: 47 additions & 0 deletions srsilo-updater/README.md
@@ -0,0 +1,47 @@
# srSILO Updater

Rust-based data pipeline tools for the srSILO multi-virus genomic database.

## Overview

This workspace contains 5 binaries that power the srSILO data pipeline:

- `check_new_data` - Queries LAPIS API to detect new sequence data
- `fetch_silo_data` - Downloads NDJSON data from LAPIS API
- `split_into_sorted_chunks` - Splits large datasets for parallel processing
- `merge_sorted_chunks` - Merges sorted chunks into final dataset
- `add_offset` - Adjusts timestamps for incremental updates

## Building

```bash
# Development build
cargo build

# Release build (used in deployment)
cargo build --release
```

## Testing

```bash
cargo test --workspace
```

## Code Quality

```bash
# Format check
cargo fmt --all -- --check

# Linting
cargo clippy --workspace --all-targets -- -D warnings
```

## Deployment

These tools are deployed to `/opt/srsilo/tools/` on srSILO hosts via Ansible. See the main WisePulse documentation for deployment instructions.
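
A quick post-deployment sanity check can be sketched as follows. It assumes the release binaries end up in `/opt/srsilo/tools/target/release/`, as the directory layout in the architecture docs suggests; the helper name `missing_tools` is hypothetical and not part of this repository.

```bash
# Binary names as listed in the Overview section.
SRSILO_TOOLS="check_new_data fetch_silo_data split_into_sorted_chunks merge_sorted_chunks add_offset"

# missing_tools DIR (hypothetical helper)
# Prints the name of each expected binary that is not an executable regular
# file in DIR, one per line. Returns 0 if all are present, 1 otherwise.
missing_tools() {
  dir="$1"
  rc=0
  for tool in $SRSILO_TOOLS; do
    if ! { [ -f "$dir/$tool" ] && [ -x "$dir/$tool" ]; }; then
      echo "$tool"
      rc=1
    fi
  done
  return "$rc"
}
```

For example, `missing_tools /opt/srsilo/tools/target/release` should print nothing and return 0 on a host where all five binaries were built; otherwise it lists what is missing.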

## CI/CD

Quality checks run automatically on all PRs via GitHub Actions (`.github/workflows/srsilo-ci.yml`).
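
To reproduce those checks locally before opening a PR, something like the sketch below may help. The four cargo commands are copied from the workflow; the `run_ci_checks` function and the `DRY_RUN` variable are illustrative assumptions, not something the repository provides.

```bash
# run_ci_checks WORKSPACE_DIR (hypothetical helper)
# Runs the cargo commands from the rust-checks job in order, inside
# WORKSPACE_DIR, and stops at the first failure.
# With DRY_RUN=1 (an assumed convention) it only prints the commands.
run_ci_checks() (
  cd "$1" || exit 1
  while IFS= read -r cmd; do
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "$cmd"
    else
      echo "==> $cmd"
      # stdin is redirected so the command cannot consume the command list.
      sh -c "$cmd" </dev/null || exit 1
    fi
  done <<'EOF'
cargo fmt --all -- --check
cargo clippy --workspace --all-targets -- -D warnings
cargo build --release --workspace
cargo test --workspace
EOF
)
```

From the repository root, `run_ci_checks srsilo-updater` runs the checks in the same order as the `rust-checks` job, and `DRY_RUN=1 run_ci_checks srsilo-updater` only prints them.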