Elasticsearch-powered search interface to browse publicly available corpora
Try it out here: HCDS Corpus Browser

- Clone this repository:
  git clone https://github.com/uhh-lt/corpus-browser.git
- Navigate to the docker directory:
  cd corpus-browser/docker
- Adjust the environment variables:
  - Copy the .env.example file to .env:
    cp .env.example .env
  - Change UID and GID to the output of:
    id
- Run:
  docker compose up -d
- Visit http://localhost:13100/ in your browser
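
The UID and GID step above can also be scripted. The following is a minimal sketch, not part of the repository: set_env_ids is a hypothetical helper, and it assumes the .env file contains unquoted lines starting with UID= and GID=.

```shell
# Hypothetical helper (not shipped with corpus-browser): rewrite the UID= and
# GID= lines of an env file with the current user's numeric IDs.
# Assumption: the file has lines that start with "UID=" and "GID=".
set_env_ids() {
  env_file="$1"
  tmp_file="$(mktemp)"
  # id -u / id -g print the numeric user and group IDs
  sed -e "s/^UID=.*/UID=$(id -u)/" \
      -e "s/^GID=.*/GID=$(id -g)/" "$env_file" > "$tmp_file"
  # Write back in place so the original file's permissions are kept
  cat "$tmp_file" > "$env_file"
  rm -f "$tmp_file"
}

# Usage, from corpus-browser/docker:
#   cp .env.example .env
#   set_env_ids .env
```

All other lines of the file are left untouched, so any further settings in .env survive the rewrite.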

To import data into an index:
- Navigate to the importer directory:
  cd corpus-browser/importer
- Install the conda environment:
  conda env create -f environment.yaml
- Activate the conda environment:
  conda activate corpus-browser
- Run the importer:
  python importer.py --index germanu15 --input_dir ../data/uhh/json
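
Before running the importer, it can help to confirm that the input directory actually holds data. The sketch below is only an illustration: count_json_files and run_import are hypothetical helpers, and the assumption that the importer reads *.json files located directly inside --input_dir is inferred from the path name, not confirmed by the repository.

```shell
# Hypothetical pre-flight check: count *.json files directly inside a directory.
# Assumption: the importer expects .json files at the top level of --input_dir.
count_json_files() {
  find "$1" -maxdepth 1 -type f -name '*.json' | wc -l | tr -d ' '
}

# Run the importer only if the input directory holds at least one JSON file.
# The index name germanu15 is taken from the command above.
run_import() {
  input_dir="$1"
  if [ "$(count_json_files "$input_dir")" -eq 0 ]; then
    echo "no .json files found in $input_dir" >&2
    return 1
  fi
  python importer.py --index germanu15 --input_dir "$input_dir"
}

# Usage, from corpus-browser/importer with the conda environment active:
#   run_import ../data/uhh/json
```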

To run the frontend locally for development:
- Navigate to the docker directory (cd corpus-browser/docker) and remove frontend from COMPOSE_PROFILES in the .env file
- Start the docker containers:
  docker compose up -d
- Navigate to the frontend directory (cd corpus-browser/frontend) and install all dependencies (npm i)
- Start the frontend:
  npm run dev
- Visit http://localhost:5173/ in your browser
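
Removing frontend from COMPOSE_PROFILES can be done by hand or scripted. The following is a rough sketch under stated assumptions: remove_profile is a hypothetical helper, and it assumes .env holds a single unquoted line of the form COMPOSE_PROFILES=a,b,c (Docker Compose reads this variable as a comma-separated list of profile names).

```shell
# Hypothetical helper: drop one profile from the comma-separated
# COMPOSE_PROFILES line of an env file, leaving all other lines unchanged.
# Assumption: the value is unquoted, e.g. COMPOSE_PROFILES=a,b,c
remove_profile() {
  profile="$1"
  env_file="$2"
  tmp_file="$(mktemp)"
  awk -v p="$profile" '
    /^COMPOSE_PROFILES=/ {
      sub(/^COMPOSE_PROFILES=/, "")      # keep only the value part
      n = split($0, parts, ",")
      out = ""
      for (i = 1; i <= n; i++) {
        # keep every non-empty profile except the one being removed
        if (parts[i] != p && parts[i] != "") {
          out = (out == "" ? parts[i] : out "," parts[i])
        }
      }
      print "COMPOSE_PROFILES=" out
      next
    }
    { print }
  ' "$env_file" > "$tmp_file"
  cat "$tmp_file" > "$env_file"
  rm -f "$tmp_file"
}

# Usage, from corpus-browser/docker:
#   remove_profile frontend .env
#   docker compose up -d
```

With the frontend profile removed, docker compose up -d starts only the remaining services, and the frontend is served by npm run dev instead.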