hipster-philology/pyrrha

Pyrrha

Pyrrha is a simple Python Flask web app that speeds up the post-correction of lemmatized and morpho-syntactically tagged corpora.

How to cite

This web application is developed and maintained by Julien Pilla (@MrGecko) and Thibault Clérice (@ponteineptique). Because software is research output, please cite it if you use it, using the following information:

@software{thibault_clerice_2019_3524771,
  author       = {Clérice, Thibault and Janès, Juliette and Pilla, Julien and Camps, Jean-Baptiste and Pinche, Ariane and Gille-Levenson, Matthias and Jolivet, Vincent},
  title        = {Pyrrha, A language independant post correction app for POS and lemmatization},
  month        = nov,
  year         = 2024,
  publisher    = {Zenodo},
  version      = {4.0.0},
  doi          = {10.5281/zenodo.2325427},
  url          = {https://doi.org/10.5281/zenodo.2325427}
}

They used Pyrrha

A list of projects and papers that used Pyrrha is available in the examples.bib file.

Update the translations

From the root directory, run:

pybabel compile -d translations

Demo

Pandora Post-Correction Editor

Install

Start by cloning the repository and moving into the created folder:

git clone https://github.com/hipster-philology/pyrrha.git
cd pyrrha/

Create a virtual environment, source it, and run:

pip install -r requirements.txt
python manage.py --config dev db-create

Run

python manage.py --config dev run

Creating a new user locally

  1. Run the application.
  2. Click Register and create an account. Note the email address you register with.
  3. Stop the application.
  4. Run python manage.py edit-user [EMAIL] --confirm-mail --role Administrator, or simply python manage.py edit-user [EMAIL] --confirm-mail if you do not want the Administrator role. Replace [EMAIL] with the address you registered with. If you are running Pyrrha only for yourself, we recommend the Administrator role.
  5. Run the application again, log in, and enjoy!

Update the translations

From the root directory, run:

python manage.py translate compile

If you changed the templates or variables:

python manage.py translate update
# Change the translation and then do
python manage.py translate compile

If you want to add a language:

python manage.py translate init fr
python manage.py translate update
python manage.py translate compile

CLI Reference

All commands are run via manage.py. Pass --config <name> to select the environment: dev (default), prod, or test.

python manage.py --config <env> <command> [options]
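
When these commands are driven from a script (a deployment or backup job, for instance), it can help to assemble the argument list in one place. The following is a minimal sketch and not part of Pyrrha: the helper name manage_command is hypothetical, and it only builds the general form shown above without running anything.

```python
import sys

# Environment names listed in this README.
KNOWN_ENVS = ("dev", "prod", "test")


def manage_command(command, *options, env="dev", python=sys.executable):
    """Build the argument list for `python manage.py --config <env> <command> [options]`.

    Hypothetical helper for illustration; it does not execute the command.
    """
    if env not in KNOWN_ENVS:
        raise ValueError(f"unknown environment {env!r}; expected one of {KNOWN_ENVS}")
    return [python, "manage.py", "--config", env, command, *options]


if __name__ == "__main__":
    # Print the command only. To run it, pass the list to
    # subprocess.run(..., check=True) from the repository root.
    print(manage_command("db-upgrade", env="prod"))
```

Rejecting unknown environment names early avoids, for example, running a destructive command against the wrong configuration because of a typo.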

Database setup

| Command | Description |
| --- | --- |
| db-create | Create the database (if it doesn't exist) and apply all Alembic migrations. Run this once on a fresh install. |
| db-recreate | Drop and recreate the database. Destroys all data; do not use in production. |
| db-fixtures | Load demo corpora (Wauchier, Floovant) for local testing. |

Schema migrations (Alembic)

Pyrrha uses Alembic (via Flask-Migrate) for schema versioning. All db-migrate / db-upgrade commands below wrap Alembic.

| Command | Description |
| --- | --- |
| db-migrate -m "message" | Auto-generate a new migration from model changes. Review the generated file in migrations/versions/ before applying. |
| db-upgrade [--revision head] | Apply all pending migrations (default: up to head). |
| db-downgrade [--revision -1] | Revert migrations (default: one step back). |
| db-current | Show the currently applied revision. |
| db-history | List all migrations and their status. |
| db-stamp <revision> | Mark the database as being at revision without running any SQL. Use head when onboarding an existing database that was created before Alembic was introduced; this tells Alembic "the schema is already current". |

Typical development workflow for a schema change:

# 1. Edit the model in app/models/
# 2. Generate the migration
python manage.py --config dev db-migrate -m "Add gloss column to word_token"
# 3. Review migrations/versions/<hash>_add_gloss_column_to_word_token.py
# 4. Apply it
python manage.py --config dev db-upgrade
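
The review in step 3 is easy to skip when the generated filename is not known in advance. As a small convenience, the newest file in migrations/versions/ can be located with a few lines of Python. This is only a sketch: it assumes that the most recently modified .py file is the one db-migrate just wrote, and the function name newest_migration is hypothetical.

```python
from pathlib import Path


def newest_migration(versions_dir="migrations/versions"):
    """Return the most recently modified .py file in versions_dir, or None."""
    candidates = [p for p in Path(versions_dir).glob("*.py") if p.is_file()]
    if not candidates:
        return None
    # Assumption: the file written last is the migration that was just generated.
    return max(candidates, key=lambda p: p.stat().st_mtime)


if __name__ == "__main__":
    path = newest_migration()
    print(path if path else "No migration files found")
```

Run from the repository root, this prints the path to open for review before applying db-upgrade.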

Onboarding an existing production database:

# Run once after upgrading to a version that introduced Alembic
python manage.py --config prod db-stamp head
# From now on, use db-upgrade for all future schema changes
python manage.py --config prod db-upgrade

Database backup

python manage.py --config prod db-dump /backups/pyrrha_$(date +%Y%m%d).dump

For PostgreSQL, this calls pg_dump --format=custom and writes a binary dump that can be restored with pg_restore. For SQLite, it copies the database file.
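
To make the two strategies concrete, here is a minimal sketch of what such a backup could look like in plain Python. It is not Pyrrha's implementation: the function names and the database name pyrrha are hypothetical, and the only pg_dump options used are --format=custom and --file.

```python
import shutil
import sqlite3
import tempfile
from pathlib import Path


def pg_dump_command(dbname, dest):
    """Argument list for a custom-format PostgreSQL dump (restorable with pg_restore)."""
    return ["pg_dump", "--format=custom", f"--file={dest}", dbname]


def copy_sqlite_database(src, dest):
    """Back up a SQLite database by copying its file, preserving timestamps."""
    # Assumption: nothing is writing to the database while it is being copied.
    shutil.copy2(src, dest)
    return Path(dest)


if __name__ == "__main__":
    # Demonstrate the SQLite branch on a throwaway database.
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "app.sqlite"
        con = sqlite3.connect(src)
        con.execute("CREATE TABLE demo (id INTEGER)")
        con.commit()
        con.close()
        backup = copy_sqlite_database(src, Path(tmp) / "backup.dump")
        print(backup.exists())
    # The PostgreSQL branch is only printed here, not executed.
    print(pg_dump_command("pyrrha", "/backups/pyrrha.dump"))
```

A plain file copy is only safe when the application is stopped or idle, which is why the sketch states that assumption explicitly.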

Corpus management

| Command | Description |
| --- | --- |
| corpus-list | List all corpora with their IDs. |
| corpus-from-file NAME --corpus FILE [--lemma FILE] [--POS FILE] [--morph FILE] [--left N] [--right N] | Create a corpus from a TSV token file plus optional allowed-value lists. |
| corpus-from-dir NAME <dir> | Create a corpus from a directory containing tokens.csv, allowed_lemma.txt, allowed_pos.txt, allowed_morph.csv. |
| corpus-dump <ID> --path <dir> | Export a corpus (tokens + allowed values) to a directory. Use corpus-list to find the ID. |
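
Before calling corpus-from-dir, it can be useful to check that the directory holds the file names listed in its description above. The sketch below only reports which of those names are absent. Whether the allowed-value files are mandatory for corpus-from-dir is not stated here, so the check makes no assumption about that, and the function name missing_corpus_files is hypothetical.

```python
import sys
from pathlib import Path

# File names taken from the corpus-from-dir description.
EXPECTED_FILES = ("tokens.csv", "allowed_lemma.txt", "allowed_pos.txt", "allowed_morph.csv")


def missing_corpus_files(directory):
    """Return the expected file names that are absent from directory."""
    root = Path(directory)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    # Check the directory given on the command line, or the current one.
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    missing = missing_corpus_files(target)
    if missing:
        print("Missing: " + ", ".join(missing))
    else:
        print("All expected files present")
```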

User management

# Confirm email and grant Administrator role
python manage.py edit-user someone@example.com --confirm-mail --role Administrator

# Confirm email only
python manage.py edit-user someone@example.com --confirm-mail

How to contribute

Maintainers

Past maintainers

Contributors

Source

This app is intended to be simple and local for now (no user system). To keep the ability to extend it and plug in other systems, we based some of our decisions on https://github.com/hack4impact/flask-base/, and the general structure follows theirs.
