LimsETL is a bioinformatics data extraction and processing pipeline that interfaces with IGO LIMS (Laboratory Information Management System) to retrieve genomic project metadata and generate standardized project files for downstream analysis pipelines.
- Extract project metadata from IGO LIMS via REST API
- Generate pipeline-specific project files for variant calling, RNA-seq, and ChIP-seq
- Support for multiple sequencing zones (JUNO, IRIS) with dynamic path handling
- Automated sample-to-FASTQ mapping with comprehensive metadata
- Request merging capabilities for multi-request projects
- Python 3.x
- R (for pipeline project generation)
- Access to IGO LIMS system
- Clone repository
- Install Python dependencies:
pip3 install -r requirements.txt
- Configure LIMS credentials:
cp conf.py.tmpl conf.py
Edit conf.py and set:
self.LIMS_USERNAME="your_username"
self.LIMS_PASSWORD="your_password"
python3 getProjectFiles.py <PROJECT_NUMBER>
Generates three output files:
- Proj_<ID>_metadata.yaml: Project-level metadata
- Proj_<ID>_sample_mapping.txt: Sample-to-file mappings
- Proj_<ID>_metadata_samples.csv: Detailed sample metadata
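
To confirm that a run produced all three files, a small check like the following may help. This is a minimal sketch, not part of LimsETL: the helper names `expected_outputs` and `missing_outputs` are hypothetical, and it assumes that `<ID>` matches the project number passed on the command line and that the files are written to the directory you point it at.

```python
from pathlib import Path

# File name patterns taken from the output list above.
OUTPUT_TEMPLATES = (
    "Proj_{pid}_metadata.yaml",
    "Proj_{pid}_sample_mapping.txt",
    "Proj_{pid}_metadata_samples.csv",
)


def expected_outputs(project_id, out_dir="."):
    """Return the three paths a run is expected to produce (hypothetical helper)."""
    base = Path(out_dir)
    return [base / template.format(pid=project_id) for template in OUTPUT_TEMPLATES]


def missing_outputs(project_id, out_dir="."):
    """Return the expected paths that do not exist as regular files."""
    return [path for path in expected_outputs(project_id, out_dir) if not path.is_file()]
```

Calling `missing_outputs("<PROJECT_NUMBER>")` after a run returns an empty list when every file is present.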
- Environment zone detection for FASTQ path resolution
- Caching via cachier library for API performance
- Robust null value handling in metadata parsing
- Primary IGO ID validation to prevent duplicates
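
The null handling and duplicate check listed above could look roughly like the sketch below. It illustrates the idea only and is not the code LimsETL uses: the set of null tokens, the function names, and the `igoId` field name are all assumptions made for the example.

```python
# Strings treated as "no value" in this sketch (an assumed list).
NULL_TOKENS = {"", "null", "none", "na", "n/a"}


def clean_value(value, default=""):
    """Map None and null-like strings to a default; pass other values through."""
    if value is None:
        return default
    if isinstance(value, str) and value.strip().lower() in NULL_TOKENS:
        return default
    return value


def find_duplicate_ids(samples, key="igoId"):
    """Return primary IDs that occur more than once, in first-repeat order.

    `samples` is a list of dicts; `key` is a hypothetical field name.
    Samples whose ID is missing or null-like are skipped.
    """
    seen = set()
    duplicates = []
    for sample in samples:
        sample_id = clean_value(sample.get(key), default=None)
        if sample_id is None:
            continue
        if sample_id in seen and sample_id not in duplicates:
            duplicates.append(sample_id)
        seen.add(sample_id)
    return duplicates
```

A caller could raise an error when `find_duplicate_ids` returns a non-empty list, so that a duplicated sample never reaches the mapping file.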
cd UnitTests
./doUnitTest01.sh