department-of-veterans-affairs/contention-classification-api

Contention Classification

The /contention-classification/expanded-contention-classification endpoint maps contention text and diagnostic codes from a 526 submission to contention classification codes as defined in the Benefits Reference Data API.

Getting started

This service can be run standalone using Poetry for dependency management or using Docker.

Python using Poetry

Install Python 3.12.3

Mac Users: you can use pyenv to handle multiple python versions

brew install pyenv
pyenv install 3.12.3 # Installs Python 3.12.3
pyenv global 3.12.3 # or don't do this if you want a different version available globally for your system

Mac Users: If the Python path hasn't been set up, you can put the following in your ~/.zshrc

export PYENV_ROOT="$HOME/.pyenv"
export PATH="$PYENV_ROOT/shims:$PATH"
if which pyenv > /dev/null; then eval "$(pyenv init -)"; fi # Initialize pyenv in the current shell session

Install Poetry

This project uses Poetry to manage dependencies.

Follow the directions on the Poetry website for installation:

curl -sSL https://install.python-poetry.org | python3 -

Install dependencies

Use Poetry to install all dependencies:

poetry install

Install pre-commit hooks

poetry run pre-commit install

To run the pre-commit hooks manually:

poetry run pre-commit run --all-files

Run the server

Using Poetry, run the FastAPI server:

poetry run uvicorn python_src.api:app --port 8120 --reload

Run tests

Using Poetry, run the test suite:

poetry run pytest

For test coverage report:

poetry run pytest --cov=src --cov-report=term-missing

Running with Docker

This application can also be run with Docker using the following commands.

docker compose down
docker compose build --no-cache
docker compose up -d

Running the ML Classifier locally

The ML classifier files are required for full functionality of the hybrid-contention-classification and ml-contention-classification endpoints; if the files are not present, these endpoints return "classification_code": null and "classification_name": "no-model".
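When scripting against these endpoints, it can help to detect the "no-model" fallback explicitly. The following is a minimal sketch: the helper name is_no_model_result is hypothetical, and it assumes each classification in the response is available as a dictionary carrying the two fields named above. The overall response schema is not described here, so extracting those dictionaries is left to the caller.

```python
from typing import Any, Mapping


def is_no_model_result(classification: Mapping[str, Any]) -> bool:
    """Return True when a classification carries the 'no-model' fallback.

    Hypothetical helper: it only inspects the two fields documented above
    and makes no assumption about the rest of the response.
    """
    return (
        classification.get("classification_code") is None
        and classification.get("classification_name") == "no-model"
    )
```

A result flagged by this check suggests the model or vectorizer file was not found at startup, rather than that the contention itself was unclassifiable.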

The ML classifier runs in an ONNX environment that is loaded with two static files: a model (.onnx) and a vectorizer (.pkl). In the deployed k8s environments, these files are downloaded from AWS S3 using a system account, as described in a Deployment Configuration (internal wiki).

For local development, these static files can be downloaded from the VA SharePoint or from AWS S3. The locations of those files are described in the internal wiki. The app_config expects these files to be saved to a models/ directory, although this can be customized in the ml_classifier section of the app_config.
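Before starting the server, a quick check that both static files are in place can save a confusing "no-model" response later. This is a minimal sketch, not part of the application: the function names are hypothetical, the actual file names are not stated here (so the check only looks at the .onnx and .pkl extensions), and it assumes the default models/ directory.

```python
from pathlib import Path
from typing import Dict, List


def find_model_files(models_dir: str = "models") -> Dict[str, List[str]]:
    """List the .onnx and .pkl file names found directly inside models_dir."""
    found: Dict[str, List[str]] = {".onnx": [], ".pkl": []}
    directory = Path(models_dir)
    if not directory.is_dir():
        return found  # a missing directory means nothing was found
    for path in sorted(directory.iterdir()):
        if path.is_file() and path.suffix in found:
            found[path.suffix].append(path.name)
    return found


def classifier_files_present(models_dir: str = "models") -> bool:
    """True only when at least one model and one vectorizer file exist."""
    found = find_model_files(models_dir)
    return bool(found[".onnx"]) and bool(found[".pkl"])
```

If the directory has been customized in the ml_classifier section of the app_config, pass that path instead of the default.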

The .pkl file is not appropriate for use beyond the local dev environment due to known security weaknesses in the pickle format. As noted in the official Python documentation:

Warning: The pickle module is not secure. Only unpickle data you trust.

For non-local dev, an ONNX format is intended.

Neither the .pkl nor .onnx files should be committed to the GitHub repository, as we cannot guarantee that they are free of PII/PHI. As a precaution, both file extensions are flagged in .gitignore.

ML Model Integrity Verification (Optional)

The application includes SHA-256 checksum verification for ML model files to ensure file integrity. This feature is configured through environment variables:

Development/Local Environment:

export ML_MODEL_SHA256=your_model_file_sha256_hash
export ML_VECTORIZER_SHA256=your_vectorizer_file_sha256_hash
export DISABLE_SHA_VERIFICATION=true  # To disable verification during development

Production Environment: In production, these environment variables should be configured through Helm charts rather than manual exports. The values are set in the deployment configuration (the env block in each environment's deployment.yaml) and applied through the Kubernetes deployment process.

These environment variables take precedence over the default checksums configured in app_config.yaml. The DISABLE_SHA_VERIFICATION flag allows bypassing verification when needed for development or testing purposes.
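To produce the hash values for the exports above, or to reproduce the check by hand, the logic can be sketched as follows. This is an illustration under assumptions, not the application's implementation: the function names are hypothetical, config_default stands in for whatever checksum app_config.yaml provides, and treating a missing checksum as a failure is a choice made for this sketch.

```python
import hashlib
import os
from typing import Optional


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def resolve_expected_sha(env_var: str, config_default: Optional[str]) -> Optional[str]:
    """The environment variable, when set, takes precedence over the config default."""
    return os.environ.get(env_var) or config_default


def verify_file(path: str, env_var: str, config_default: Optional[str] = None) -> bool:
    """Return True if the file matches its expected checksum or verification is disabled."""
    if os.environ.get("DISABLE_SHA_VERIFICATION", "").lower() == "true":
        return True  # verification bypassed for development or testing
    expected = resolve_expected_sha(env_var, config_default)
    if not expected:
        return False  # assumption: no configured checksum counts as a failure
    return sha256_of_file(path) == expected.strip().lower()
```

For example, the output of sha256_of_file for the model file is the value to export as ML_MODEL_SHA256, and verify_file(path, "ML_MODEL_SHA256") then checks the file against it.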

Testing locally

With the application running using either Docker or Python, test requests can be sent using the following curl commands.

To check that the application is running and healthy, call the contention-classification/health endpoint:

curl -X 'GET' 'http://localhost:8120/health'

To test the classification provided by the endpoint at contention-classification/expanded-contention-classification:

curl -X 'POST'   'http://localhost:8120/expanded-contention-classification'   -H 'accept: application/json'   -H 'Content-Type: application/json'   -d '{
  "claim_id": 44,
  "form526_submission_id": 55,
  "contentions": [
        {
            "contention_text": "PTSD (post-traumatic stress disorder)",
            "contention_type": "NEW"
        },
        {
            "contention_text": "acl tear, right",
            "contention_type": "NEW"
        },
        {
            "contention_text": "",
            "contention_type": "INCREASE",
            "diagnostic_code": 5012
        }
    ]
}'

To test the classification provided by the endpoint at contention-classification/ml-contention-classification (note that claim_id and form526_submission_id are absent from the request body):

curl -X 'POST'   'http://localhost:8120/ml-contention-classification'   -H 'accept: application/json'   -H 'Content-Type: application/json'   -d '{
  "contentions": [
        {
            "contention_text": "PTSD (post-traumatic stress disorder)",
            "contention_type": "NEW"
        },
        {
            "contention_text": "acl tear, right",
            "contention_type": "NEW"
        },
        {
            "contention_text": "",
            "contention_type": "INCREASE",
            "diagnostic_code": 5012
        }
    ]
}'

To test the classification provided by the endpoint at contention-classification/hybrid-contention-classification:

curl -X 'POST'   'http://localhost:8120/hybrid-contention-classification'   -H 'accept: application/json'   -H 'Content-Type: application/json'   -d '{
  "claim_id": 44,
  "form526_submission_id": 55,
  "contentions": [
        {
            "contention_text": "lorem ipsum unclassifiable",
            "contention_type": "NEW"
        },
        {
            "contention_text": "acl tear, right",
            "contention_type": "NEW"
        },
        {
            "contention_text": "",
            "contention_type": "INCREASE",
            "diagnostic_code": 5012
        },
        {
            "contention_text": "",
            "contention_type": "INCREASE",
            "diagnostic_code": 7777777777777
        }
    ]
}'

An alternative to the above curl commands is to use a local testing application like Bruno or Postman. Different JSON request bodies can be set up for testing each of the above endpoints and tests can be saved using Collections within these tools.
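The same requests can also be scripted with the Python standard library, which is convenient for repeating a test with different bodies. The following is a rough sketch assuming the server is listening on localhost:8120 as in the curl examples; the names BASE_URL, build_request, classify, and EXAMPLE_PAYLOAD are illustrative and not part of this repository.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8120"  # assumed local server address

# Request body copied from the hybrid-contention-classification curl example.
EXAMPLE_PAYLOAD = {
    "claim_id": 44,
    "form526_submission_id": 55,
    "contentions": [
        {"contention_text": "lorem ipsum unclassifiable", "contention_type": "NEW"},
        {"contention_text": "acl tear, right", "contention_type": "NEW"},
        {"contention_text": "", "contention_type": "INCREASE", "diagnostic_code": 5012},
    ],
}


def build_request(endpoint: str, payload: dict) -> urllib.request.Request:
    """Build a JSON POST request for one of the classification endpoints."""
    return urllib.request.Request(
        url=f"{BASE_URL}/{endpoint.lstrip('/')}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"accept": "application/json", "Content-Type": "application/json"},
        method="POST",
    )


def classify(endpoint: str, payload: dict, timeout: float = 10.0) -> dict:
    """Send the request and return the decoded JSON response."""
    request = build_request(endpoint, payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, classify("hybrid-contention-classification", EXAMPLE_PAYLOAD) returns the parsed response. For the ml-contention-classification endpoint, drop claim_id and form526_submission_id from the payload, as in the curl example.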

Building docs

API documentation is generated automatically by FastAPI. It can be viewed by visiting localhost:8120/docs while the application is running.

To export the OpenAPI spec:

poetry run python src/python_src/util/pull_api_documentation.py

Deploying to VA Platform

Building the image and publishing to ECR

Images are built and pushed to ECR using the build_and_push_to_ecr.yml workflow, which is triggered in one of two ways:

  • Automatically: when changes are pushed to the main branch, which should only happen when a Pull Request is merged into the main branch
  • Manually: by triggering the action in GitHub

This workflow is not triggered when changes are pushed to any branch other than the main branch.

Deploying the image

The image is released to the VA Platform using the release.yml workflow, which is triggered when a new image is pushed to ECR. This workflow automatically deploys the latest image to the dev and staging environments. The sandbox and prod environments must be deployed manually by triggering the action in GitHub and selecting the desired environment(s).

Note that manually triggering the deployment will deploy the most recent commit hash to the selected environment(s).
