@@ -31,51 +31,54 @@ export RRC_DATA_DIR=/path/to/output
 docker run --rm \
   -v $RRC_IMAGE_DIR:/data/images \
   -v $RRC_DATA_DIR:/data/output \
-  ghcr.io/reglab/rrc-pipeline:latest ingest
+  ghcr.io/reglab/rrc-pipeline:latest rrc ingest

 # 2. Run OCR
 docker run --rm --gpus all \
   -v $RRC_IMAGE_DIR:/data/images \
   -v $RRC_DATA_DIR:/data/output \
-  ghcr.io/reglab/rrc-pipeline:latest ocr
+  ghcr.io/reglab/rrc-pipeline:latest rrc ocr

 # 3. Detect covenants
 docker run --rm --gpus all \
   -v $RRC_IMAGE_DIR:/data/images \
   -v $RRC_DATA_DIR:/data/output \
-  ghcr.io/reglab/rrc-pipeline:latest detect
+  ghcr.io/reglab/rrc-pipeline:latest rrc detect

 # 4. Export results
 docker run --rm \
   -v $RRC_IMAGE_DIR:/data/images \
   -v $RRC_DATA_DIR:/data/output \
-  ghcr.io/reglab/rrc-pipeline:latest export
+  ghcr.io/reglab/rrc-pipeline:latest rrc export
 ```

 ## Pipeline Stages

-### 1. Ingest (`ingest`)
+> [!NOTE]
+> To see all available commands, run `docker run --rm ghcr.io/reglab/rrc-pipeline:latest rrc --help`.
+
+### 1. Ingest (`rrc ingest`)
 - Scans input directory for image files (jpg, jpeg, png, tiff, tif, bmp)
 - Validates images can be opened
 - Handles multi-page TIFF files
 - Creates database records for new images

-### 2. OCR (`ocr`)
+### 2. OCR (`rrc ocr`)
 - Transcribes images using the DocTR OCR library
 - Requires GPU acceleration
 - Processes only images without existing transcriptions

-### 3. Detection (`detect`)
+### 3. Detection (`rrc detect`)
 - Analyzes transcribed text using our Mistral-based covenant detection model
 - Requires GPU acceleration
 - Identifies presence of racial covenants and extracts relevant passages
 - Processes only transcribed pages without existing predictions

-### 4. Export (`export`)
+### 4. Export (`rrc export`)
 - Exports detection results to CSV format
 - Includes confidence scores and extracted covenant text where found

-### 5. Pipeline Summary (`summarize`)
+### 5. Pipeline Summary (`rrc summarize`)
 - Displays current pipeline progress and statistics
 - Shows total page counts and processing status
 - Reports covenant detection statistics
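
The four `docker run` invocations in this hunk differ only in the stage name and in whether `--gpus all` is passed. As a rough illustration of the new `rrc <stage>` form, the sketch below prints the command for each stage so it can be reviewed before running. The helper name `stage_cmd` is hypothetical and not part of the pipeline, and the script assumes `RRC_IMAGE_DIR` and `RRC_DATA_DIR` are already exported as in the README.

```shell
#!/bin/sh
# Sketch only: `stage_cmd` is a hypothetical helper, not part of rrc-pipeline.
# Assumes RRC_IMAGE_DIR and RRC_DATA_DIR are exported, as in the README.

IMAGE="ghcr.io/reglab/rrc-pipeline:latest"

# Print the docker command for one pipeline stage.
stage_cmd() {
  gpu_flag=""
  # Per the README, only OCR and detection require GPU acceleration.
  case "$1" in
    ocr|detect) gpu_flag=" --gpus all" ;;
  esac
  printf 'docker run --rm%s -v %s:/data/images -v %s:/data/output %s rrc %s\n' \
    "$gpu_flag" "$RRC_IMAGE_DIR" "$RRC_DATA_DIR" "$IMAGE" "$1"
}

# Print the four stages in the order the README runs them.
for stage in ingest ocr detect export; do
  stage_cmd "$stage"
done
```

Calling `stage_cmd summarize` would print the equivalent command for the summary stage, on the assumption that it takes the same volume mounts as the other stages.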