The /contention-classification/expanded-contention-classification endpoint maps contention text and diagnostic codes from 526 submissions to contention classification codes as defined in the Benefits Reference Data API.
This service can be run standalone using Poetry for dependency management or using Docker.
Mac users: you can use pyenv to manage multiple Python versions:
brew install pyenv
pyenv install 3.12.3 # Installs Python 3.12.3
pyenv global 3.12.3 # or don't do this if you want a different version available globally for your system
Mac users: if the Python path hasn't been set up, you can add the following to your ~/.zshrc:
export PYENV_ROOT="$HOME/.pyenv"
export PATH="$PYENV_ROOT/shims:$PATH"
if which pyenv > /dev/null; then eval "$(pyenv init -)"; fi # Initialize pyenv in the current shell session
This project uses Poetry to manage dependencies.
Follow the directions on the Poetry website for installation:
curl -sSL https://install.python-poetry.org | python3 -
Use Poetry to install all dependencies:
poetry install
Install the pre-commit hooks:
poetry run pre-commit install
To run the pre-commit hooks manually:
poetry run pre-commit run --all-files
Using Poetry, run the FastAPI server:
poetry run uvicorn python_src.api:app --port 8120 --reload
Using Poetry, run the test suite:
poetry run pytest
For a test coverage report:
poetry run pytest --cov=src --cov-report=term-missing
This application can also be run with Docker using the following commands.
docker compose down
docker compose build --no-cache
docker compose up -d
The ML classifier files are required for full functionality of the hybrid-contention-classification and ml-contention-classification endpoints; if the ML classifier files are not present, the endpoints will return "classification_code": null and "classification_name": "no-model".
The ML classifier runs in an ONNX environment that is loaded with two static files: a model (.onnx) and a vectorizer (.pkl). In the deployed k8s environments, these files are downloaded from AWS S3 using a system account, as described in a Deployment Configuration (internal wiki).
For local development, these static files can be downloaded from the VA SharePoint or from AWS S3. The locations of those files are described in the internal wiki. The app_config expects these files to be saved to a models/ directory, although this can be customized in the ml_classifier section of the app_config.
The .pkl file is not appropriate for use beyond the local dev environment due to known security weaknesses. As noted in the official Python documentation:
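As a quick local sanity check before starting the server, the following sketch lists the .onnx and .pkl files found in the models/ directory. The function name `find_classifier_files` is hypothetical and not part of the application; the actual file names and directory are whatever the ml_classifier section of the app_config specifies.

```python
from pathlib import Path


def find_classifier_files(models_dir: str = "models") -> dict[str, list[str]]:
    """Return the .onnx and .pkl file names found directly under models_dir.

    Hypothetical helper for local development; the application itself reads
    the expected locations from the ml_classifier section of the app_config.
    """
    root = Path(models_dir)
    found: dict[str, list[str]] = {".onnx": [], ".pkl": []}
    if not root.is_dir():
        # A missing directory simply means no classifier files are available.
        return found
    for path in sorted(root.iterdir()):
        if path.is_file() and path.suffix in found:
            found[path.suffix].append(path.name)
    return found


if __name__ == "__main__":
    for suffix, names in find_classifier_files().items():
        status = ", ".join(names) if names else "missing"
        print(f"{suffix}: {status}")
```

If either suffix is reported as missing, the hybrid and ML endpoints can be expected to return the "no-model" placeholder values.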
Warning: The pickle module is not secure. Only unpickle data you trust.
For non-local environments, the ONNX format is intended.
Neither the .pkl nor .onnx files should be committed to the GitHub repository, as we cannot guarantee that they are free of PII/PHI. As a precaution, both file extensions are flagged in .gitignore.
The application includes SHA-256 checksum verification for ML model files to ensure file integrity. This feature can be configured through:
Development/Local Environment:
export ML_MODEL_SHA256=your_model_file_sha256_hash
export ML_VECTORIZER_SHA256=your_vectorizer_file_sha256_hash
export DISABLE_SHA_VERIFICATION=true # To disable verification during development
Production Environment:
In production, these environment variables should be configured through Helm charts rather than manual exports. The values are set in the deployment configuration (the env block in each environment's deployment.yaml) and applied through the Kubernetes deployment process.
These environment variables take precedence over the default checksums configured in app_config.yaml. The DISABLE_SHA_VERIFICATION flag allows bypassing verification when needed for development or testing purposes.
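The verification described above can be sketched as follows. This is an illustration under stated assumptions, not the application's actual implementation: the function names `sha256_of_file` and `verify_checksum` are hypothetical, and treating only the value `true` (in any case) as disabling verification is an assumption. The sketch shows the environment variable taking precedence over a configured default.

```python
import hashlib
import os
from typing import Optional


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checksum(path: str, env_var: str, default_sha256: Optional[str] = None) -> bool:
    """Check a file against its expected SHA-256 (hypothetical helper).

    The environment variable named by env_var overrides default_sha256,
    mirroring the precedence over app_config.yaml described above.
    """
    # Assumption: only the value "true" (any case) disables verification.
    if os.environ.get("DISABLE_SHA_VERIFICATION", "").lower() == "true":
        return True
    expected = os.environ.get(env_var) or default_sha256
    if not expected:
        # Nothing to compare against; fail rather than pass silently.
        return False
    return sha256_of_file(path) == expected.strip().lower()
```

Running `sha256_of_file` on a downloaded model or vectorizer file yields a value suitable for `ML_MODEL_SHA256` or `ML_VECTORIZER_SHA256`.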
With the application running using either Docker or Python, test requests can be sent using the following curl commands.
To check the health of the application, or whether it is running at all, use the contention-classification/health endpoint:
curl -X 'GET' 'http://localhost:8120/health'
To test the classification provided by the endpoint at contention-classification/expanded-contention-classification:
curl -X 'POST' 'http://localhost:8120/expanded-contention-classification' -H 'accept: application/json' -H 'Content-Type: application/json' -d '{
"claim_id": 44,
"form526_submission_id": 55,
"contentions": [
{
"contention_text": "PTSD (post-traumatic stress disorder)",
"contention_type": "NEW"
},
{
"contention_text": "acl tear, right",
"contention_type": "NEW"
},
{
"contention_text": "",
"contention_type": "INCREASE",
"diagnostic_code": 5012
}
]
}'
To test the classification provided by the endpoint at contention-classification/ml-contention-classification:
(Note: claim_id and form526_submission_id are absent from the request body.)
curl -X 'POST' 'http://localhost:8120/ml-contention-classification' -H 'accept: application/json' -H 'Content-Type: application/json' -d '{
"contentions": [
{
"contention_text": "PTSD (post-traumatic stress disorder)",
"contention_type": "NEW"
},
{
"contention_text": "acl tear, right",
"contention_type": "NEW"
},
{
"contention_text": "",
"contention_type": "INCREASE",
"diagnostic_code": 5012
}
]
}'
To test the classification provided by the endpoint at contention-classification/hybrid-contention-classification:
curl -X 'POST' 'http://localhost:8120/hybrid-contention-classification' -H 'accept: application/json' -H 'Content-Type: application/json' -d '{
"claim_id": 44,
"form526_submission_id": 55,
"contentions": [
{
"contention_text": "lorem ipsum unclassifiable",
"contention_type": "NEW"
},
{
"contention_text": "acl tear, right",
"contention_type": "NEW"
},
{
"contention_text": "",
"contention_type": "INCREASE",
"diagnostic_code": 5012
},
{
"contention_text": "",
"contention_type": "INCREASE",
"diagnostic_code": 7777777777777
}
]
}'
An alternative to the above curl commands is to use a local testing application like Bruno or Postman. Different JSON request bodies can be set up for testing each of the above endpoints and tests can be saved using Collections within these tools.
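For scripted checks, the same requests can also be sent from Python. The sketch below uses only the standard library and assumes the server is listening on localhost:8120, as in the curl commands above. The helper names `build_request`, `post_json`, and `is_no_model` are hypothetical. The `is_no_model` check relies on the "classification_code": null and "classification_name": "no-model" values returned when the ML classifier files are absent; where those fields sit in the full response body is not documented here, so the helper takes a single classification object.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8120"  # same host and port as the curl examples


def build_request(endpoint: str, payload: dict) -> urllib.request.Request:
    """Build a JSON POST request for one of the classification endpoints."""
    return urllib.request.Request(
        url=f"{BASE_URL}/{endpoint.lstrip('/')}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"accept": "application/json", "Content-Type": "application/json"},
        method="POST",
    )


def post_json(endpoint: str, payload: dict, timeout: float = 10.0) -> dict:
    """Send the request and decode the JSON response (needs a running server)."""
    request = build_request(endpoint, payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def is_no_model(classification: dict) -> bool:
    """True when a classification object carries the 'no-model' placeholder values."""
    return (
        classification.get("classification_code") is None
        and classification.get("classification_name") == "no-model"
    )
```

With the server running, `post_json("ml-contention-classification", {"contentions": [{"contention_text": "acl tear, right", "contention_type": "NEW"}]})` sends the same kind of request as the second curl command above.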
API Documentation is automatically created by FastAPI. This can be viewed by visiting localhost:8120/docs while the application is running.
To export the OpenAPI spec:
poetry run python src/python_src/util/pull_api_documentation.py
Images are built and pushed to ECR using the build_and_push_to_ecr.yml workflow, which is triggered in one of two ways:
- Automatically: when changes are pushed to the main branch, which should only happen when a pull request is merged into the main branch
- Manually: by triggering the action in GitHub
This workflow is not triggered when changes are pushed to any branch other than the main branch.
The image is released to the VA Platform using the release.yml workflow which is triggered when a new image is pushed to ECR.
This workflow will deploy the latest image to the VA Platform automatically for the dev and staging environments.
The sandbox and prod environments must be deployed manually by triggering the action in GitHub and selecting the desired environment(s).
Note that manually triggering the deployment will deploy the most recent commit hash to the selected environment(s).