Skip to content

Latest commit

 

History

History
39 lines (25 loc) · 868 Bytes

File metadata and controls

39 lines (25 loc) · 868 Bytes

Generate training data for crime entity extraction

Basic Usage

python scenarios.py
python json.py
python format.py
python hugging_face.py

Details

Scenarios

Generate crime scenarios using the scenarios.py script. Modify the SYSTEM_MESSAGE variable to change the prompt.

JSON

Extract entities from the scenarios using the json.py script.

Format

Format the JSON output into a JSONL file using the format.py script.

Hugging Face

Upload the JSONL file to Hugging Face using the hugging_face.py script.

Prerequisites

  • python
  • pip
  • huggingface-cli
  • boto3
  • datasets
  • huggingface_hub

You should have an AWS account with the necessary permissions to use Amazon Bedrock. You should have authenticated with AWS SSO using the aws configure command. Alternatively, you can use another LLM provider.