The models are included in the models folder, where each model occupies a subfolder as its repo.
A model repo contains its README.md as a model card, which comes in two parts:
- Metadata, which is a YAML section at the top, i.e., front matter.
- Text description, which is the Markdown body, including a summary and a description of the model.
For more information, you can reference Hugging Face's model cards.
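As an illustration of the two-part layout described above, the following is a minimal sketch that splits a model card into its YAML front matter and its Markdown body. It assumes the front matter is delimited by `---` lines, as in Hugging Face model cards. The function name `split_model_card` and the example field names are hypothetical and not part of this repository.

```python
def split_model_card(text: str) -> tuple[str, str]:
    """Split a model card into (YAML front matter, Markdown body).

    The front matter is assumed to sit between a leading '---' line and
    the next '---' line. If it is missing, the metadata part is empty.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return "", text
    for i, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            metadata = "\n".join(lines[1:i])
            body = "\n".join(lines[i + 1:]).lstrip("\n")
            return metadata, body
    # No closing delimiter found: treat the whole file as the body.
    return "", text


# Hypothetical card; the field names are illustrative only.
EXAMPLE_CARD = """---
name: example_model
tags:
  - zero-shot
---
# Example model

A short summary of the model.
"""

metadata, body = split_model_card(EXAMPLE_CARD)
print(metadata)
print(body)
```

The metadata string can then be handed to any YAML parser; this sketch stops at the split so that it needs no third-party packages.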
The datasets are included in the dataset folder, where each dataset goes into a subfolder.
The archived file for each dataset is built with proteingym-base.
You can reference this guide to build the archived dataset.
The benchmark is defined in the benchmark folder, which contains two games: supervised and zero-shot.
Each game has its selected list of models and datasets defined in its dvc.yaml.
The models and datasets are defined in vars at the top, and DVC translates vars into a matrix, that is, a pair of nested loops as in the following pseudo-code:

    for dataset in datasets:
        for model in models:
            predict()

    for dataset in datasets:
        for model in models:
            calculate_metric()

You can benchmark a group of supervised models:

    dvc repro benchmark/supervised/local/dvc.yaml --single-item

You can benchmark a group of zero-shot models:

    dvc repro benchmark/zero_shot/local/dvc.yaml --single-item

Note
According to https://dvc.org/doc/command-reference/repro#-s, --single-item turns off the recursive search for changed dependencies, so only the dvc.yaml being executed is searched.
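The matrix expansion described earlier in this section can be sketched in Python as a Cartesian product of datasets and models. This is only an illustration of the looping behaviour, not how DVC is implemented. The dataset and model names, and the bodies of `predict` and `calculate_metric`, are hypothetical placeholders; the real lists live under vars in each dvc.yaml.

```python
from itertools import product

# Hypothetical names; the real lists are defined under vars in dvc.yaml.
datasets = ["dataset_a", "dataset_b"]
models = ["model_x", "model_y"]


def expand_matrix(datasets: list[str], models: list[str]) -> list[tuple[str, str]]:
    """Return every (dataset, model) pair, in the same order as the nested loops."""
    return list(product(datasets, models))


def predict(dataset: str, model: str) -> str:
    # Placeholder for the real prediction stage.
    return f"predict:{dataset}:{model}"


def calculate_metric(dataset: str, model: str) -> str:
    # Placeholder for the real metric stage.
    return f"metric:{dataset}:{model}"


pairs = expand_matrix(datasets, models)
predictions = [predict(d, m) for d, m in pairs]
metrics = [calculate_metric(d, m) for d, m in pairs]
print(len(pairs), "combinations")
```

Each pair corresponds to one generated stage instance, so adding a model or a dataset to vars grows the number of runs multiplicatively.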
There are two environments in which to run the benchmark: the local environment and the AWS environment.
The AWS environment differs in that:
- You need to upload the dataset and model files to S3.
- You need to build and push your Docker image to ECR.
- You need to use a SageMaker training job to either train or score a model.
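To make the first two requirements above concrete, here is a minimal sketch of the addresses involved: an S3 URI for uploaded files and an ECR image URI for the pushed Docker image. It only formats strings and does not call AWS. The bucket name, key, repository name, tag, and account ID are hypothetical placeholders; the actual names used by the benchmark are defined elsewhere in the repository.

```python
def s3_uri(bucket: str, key: str) -> str:
    """Build the S3 URI under which a dataset or model file would be stored."""
    return f"s3://{bucket}/{key.lstrip('/')}"


def ecr_image_uri(account_id: str, region: str, repository: str, tag: str = "latest") -> str:
    """Build the ECR image URI in the standard registry format."""
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{repository}:{tag}"


# Hypothetical values for illustration only.
dataset_location = s3_uri("example-bucket", "datasets/example_dataset.zip")
image_location = ecr_image_uri("123456789012", "us-east-1", "example-model")

print(dataset_location)
print(image_location)
```

A SageMaker training job is then typically pointed at an image URI of this form and at S3 locations of this form for its inputs and outputs.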
Important
In order to use the AWS environment, you need to set up your AWS profile with the steps below:
- Execute aws configure sso.
- Fill in the required fields, especially: "Default client Region" is "us-east-1".
  a. SSO session name: pg2benchmark.
  b. SSO start URL: https://d-90674355f1.awsapps.com/start
  c. SSO region: us-east-1.
  d. SSO registration scopes: Leave empty.
  e. Login via browser.
- Select the account: ifflabdev.
  a. Default client Region is us-east-1.
  b. CLI default output: Leave empty.
  c. Profile name: pg2benchmark.
- You can find your account ID and profile by executing cat ~/.aws/config.
- Finally, you can run dvc repro with environment variables in each game: AWS_ACCOUNT_ID=xxx AWS_PROFILE=yyy dvc repro
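As an alternative to reading ~/.aws/config by eye, the following minimal sketch extracts the two values needed for the environment variables. It assumes the profile section follows the usual layout written by aws configure sso, with an `sso_account_id` key under `[profile <name>]`. The example config text, account ID, role name, and start URL are placeholders, and `read_profile` is a hypothetical helper.

```python
import configparser

# Placeholder config in the layout that `aws configure sso` usually writes.
EXAMPLE_CONFIG = """
[profile pg2benchmark]
sso_session = pg2benchmark
sso_account_id = 123456789012
sso_role_name = ExampleRole
region = us-east-1

[sso-session pg2benchmark]
sso_start_url = https://example.awsapps.com/start
sso_region = us-east-1
"""


def read_profile(config_text: str, profile: str) -> dict[str, str]:
    """Return the AWS_PROFILE and AWS_ACCOUNT_ID values for a named profile."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    section = f"profile {profile}"
    if not parser.has_section(section):
        raise KeyError(f"profile not found: {profile}")
    return {
        "AWS_PROFILE": profile,
        "AWS_ACCOUNT_ID": parser.get(section, "sso_account_id"),
    }


aws_env = read_profile(EXAMPLE_CONFIG, "pg2benchmark")
print(aws_env)
```

To use it on a real machine, pass the contents of `Path.home() / ".aws" / "config"` instead of `EXAMPLE_CONFIG`.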
You can benchmark a group of supervised models:

    AWS_ACCOUNT_ID=xxx AWS_PROFILE=yyy dvc repro benchmark/supervised/aws/dvc.yaml --single-item

You can benchmark a group of zero-shot models:

    AWS_ACCOUNT_ID=xxx AWS_PROFILE=yyy dvc repro benchmark/zero_shot/aws/dvc.yaml --single-item