Add AWS DVC file #74

Closed
tintinrevient wants to merge 18 commits into main from feat/add-dvc-aws-yaml

Contributor

@tintinrevient tintinrevient commented Jul 22, 2025

This PR resolves #8 and #52

The major TODOs:

  • Use a ZIP file for each dataset and load it with Dataset.from_path().
  • Hyperparameters can be passed in via an S3 prefix.
  • Finally, dataset.toml is no longer needed in train().
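
A minimal sketch of the first TODO, assuming each dataset ZIP holds a single `data.csv` and that the dataset name can be taken from the archive's file name. The `Dataset` class below is a hypothetical stand-in, not the pg2-benchmark class, and the archive layout is an assumption.

```python
import csv
import io
import zipfile
from dataclasses import dataclass
from pathlib import Path


@dataclass
class Dataset:
    """Hypothetical stand-in for illustration; not the pg2-benchmark class."""

    name: str
    rows: list[dict[str, str]]

    @classmethod
    def from_path(cls, path: Path) -> "Dataset":
        # Assumed layout: the archive contains one file named data.csv.
        with zipfile.ZipFile(path) as archive:
            with archive.open("data.csv") as raw:
                text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
                rows = list(csv.DictReader(text))
        # Assumption: the dataset name is the archive name without ".zip".
        return cls(name=Path(path).stem, rows=rows)
```

With a loader of this shape, train() would only need the path to the ZIP, which is why a separate dataset.toml could be dropped.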

@tintinrevient tintinrevient requested a review from JCZuurmond July 22, 2025 09:09
Comment thread README.md Outdated
Comment thread supervised/dvc.yaml Outdated
Comment thread supervised/dvc.yaml
Comment thread supervised/dvc.yaml Outdated
cmd:
  - aws ecr describe-repositories --repository-names ${item.model.name} --region ${aws.region_name} >/dev/null 2>&1 || aws ecr create-repository --repository-name ${item.model.name} --region ${aws.region_name} >/dev/null
  - aws ecr get-login-password --region ${aws.region_name} | docker login --username AWS --password-stdin ${aws.account_id}.dkr.ecr.${aws.region_name}.amazonaws.com
  - docker buildx build --build-arg GIT_CACHE_BUST=${local.git_cache_bust} --platform linux/amd64,linux/arm64 --secret id=git_auth,src=git-auth.txt -t ${aws.account_id}.dkr.ecr.${aws.region_name}.amazonaws.com/${item.model.name}:latest ${item.model.dockerfile} --push
Contributor

@karel-w karel-w Jul 23, 2025

Would it make sense for the cmd here to be pg2-benchmark aws upload? For local runs, we perform the build in pg2-benchmark model predict.

Comment thread zero_shot/dvc.yaml Outdated
Comment thread src/pg2_benchmark/cli/aws.py Outdated
Comment thread supervised/dvc.yaml Outdated

upload_to_s3:
  cmd:
    - aws s3 cp ${local.data_dir}/ s3://${aws.s3_training_data_prefix}/${local.data_dir}/ --recursive --exclude ".*" --exclude "*/.*"
Contributor Author

Check whether they exist before running cp.
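
One way to read this note is as a guard on the local side: only emit the `aws s3 cp` command when the source directory exists and contains at least one non-hidden file. The sketch below takes that reading; `build_upload_command` is a hypothetical helper, not part of pg2-benchmark, and checking for objects that already exist in S3 would need a different approach.

```python
from pathlib import Path
from typing import Optional


def build_upload_command(local_dir: Path, s3_uri: str) -> Optional[list[str]]:
    """Return the `aws s3 cp` argv, or None when there is nothing to upload."""
    # Ignore hidden files and directories, roughly mirroring the
    # --exclude ".*" and --exclude "*/.*" patterns used in the stage.
    has_files = local_dir.is_dir() and any(
        p.is_file()
        and not any(part.startswith(".") for part in p.relative_to(local_dir).parts)
        for p in local_dir.rglob("*")
    )
    if not has_files:
        return None
    return [
        "aws", "s3", "cp", f"{local_dir}/", s3_uri,
        "--recursive", "--exclude", ".*", "--exclude", "*/.*",
    ]
```

Returning the argv instead of running it keeps the check testable without AWS credentials; the caller can pass the result to `subprocess.run` when it is not None.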

Comment thread supervised/dvc.yaml Outdated
upload_to_s3:
  cmd:
    - aws s3 cp ${local.data_dir}/ s3://${aws.s3_training_data_prefix}/${local.data_dir}/ --recursive --exclude ".*" --exclude "*/.*"
    - aws s3 cp ${local.model_dir}/ s3://${aws.s3_training_data_prefix}/${local.model_dir}/ --recursive --exclude ".*" --exclude "*/.*"
Contributor Author

Add the content to the hyperparameters, so this line can be removed.

Contributor Author

Removing model.toml is not ideal, for two reasons:

  • Each model has its own manifest file, defined case by case, so pg2-benchmark should not expand a model's manifest into hyperparameters.

  • In an AWS SageMaker training job, hyperparameters are passed as a string-to-string mapping, so typed key-value pairs from the TOML all become strings. It is better to load the model's manifest inside each model's own method.
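
To illustrate the second reason, here is a small sketch of what happens when a typed manifest is flattened into a string-to-string map. The manifest keys and the dotted-key convention are made up for the example; only the string-to-string constraint comes from SageMaker.

```python
def to_hyperparameters(config: dict) -> dict[str, str]:
    """Flatten a nested config into a str -> str map (dotted keys are an assumption)."""
    flat: dict[str, str] = {}

    def walk(prefix: str, node: dict) -> None:
        for key, value in node.items():
            name = f"{prefix}.{key}" if prefix else key
            if isinstance(value, dict):
                walk(name, value)
            else:
                # Type information is lost here: bools and ints become text.
                flat[name] = str(value)

    walk("", config)
    return flat


# Hypothetical manifest content, as it might look after parsing a TOML file.
manifest = {"name": "esm", "hyper_params": {"nogpu": False, "batch_size": 8}}
hyperparameters = to_hyperparameters(manifest)

# The boolean False is now the non-empty string "False", which is truthy.
flag_is_truthy = bool(hyperparameters["hyper_params.nogpu"])
```

Every consumer would have to parse these strings back into their original types, which is the argument for letting each model read its own manifest instead.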

@tintinrevient tintinrevient changed the base branch from refactor/build-docker-images-with-configured-paths to main July 24, 2025 16:18
Contributor Author

✅ Supervised models have all passed validation.

Metric,Value
Overall ACC,0.0
Overall RACCU,0.005050505050505051
Overall RACC,0.0
Kappa,0.0
Gwet AC1,-0.005076142131979714
Bennett S,-0.005076142131979696
Kappa Standard Error,0.0
Kappa Unbiased,-0.005076142131979696
Scott PI,-0.005076142131979696
Kappa No Prevalence,-1.0
Kappa 95% CI,"(0.0, 0.0)"
Standard Error,0.0
95% CI,"(0.0, 0.0)"
Chi-Squared,None
Phi-Squared,None
Cramer V,None
Response Entropy,6.62935662007962
Reference Entropy,6.62935662007962
Cross Entropy,0
Joint Entropy,6.62935662007962
Conditional Entropy,-0.0
Mutual Information,6.62935662007962
KL Divergence,None
Lambda B,1.0
Lambda A,1.0
Chi-Squared DF,38809
Overall J,"(0.0, 0.0)"
Hamming Loss,1.0
Zero-one Loss,99
NIR,0.010101010101010102
P-Value,1
Overall CEN,0.0
Overall MCEN,0.0
Overall MCC,0.0
RR,0.5
CBA,0.0
AUNU,None
AUNP,None
RCI,1.0
Pearson C,None
TPR Micro,0.0
TPR Macro,None
CSI,None
ARI,None
TNR Micro,0.9949238578680203
TNR Macro,0.9949494949494949
Bangdiwala B,None
Krippendorff Alpha,0.0
SOA1(Landis & Koch),Slight
SOA2(Fleiss),Poor
SOA3(Altman),Poor
SOA4(Cicchetti),Poor
SOA5(Cramer),None
SOA6(Matthews),Negligible
SOA7(Lambda A),Perfect
SOA8(Lambda B),Perfect
SOA9(Krippendorff Alpha),Low
SOA10(Pearson C),None
FPR Macro,0.005050505050505083
FNR Macro,None
PPV Macro,None
NPV Macro,0.9949494949494949
ACC Macro,0.98989898989899
F1 Macro,0.0
FPR Micro,0.005076142131979711
FNR Micro,1.0
PPV Micro,0.0
F1 Micro,0.0
NPV Micro,0.9949238578680203
Spearman,0.5496969696969696

✅ Zero-shot models have all passed validation.

Metric,Value
Overall ACC,0.0
Overall RACCU,0.00010007998789141575
Overall RACC,0.0
Kappa,0.0
Gwet AC1,-0.00010009004298543876
Bennett S,-0.00010009008107296567
Kappa Standard Error,0.0
Kappa Unbiased,-0.00010009000489789399
Scott PI,-0.00010009000489789399
Kappa No Prevalence,-1.0
Kappa 95% CI,"(0.0, 0.0)"
Standard Error,0.0
95% CI,"(0.0, 0.0)"
Chi-Squared,None
Phi-Squared,None
Cramer V,None
Response Entropy,12.286557761608659
Reference Entropy,12.286549508613042
Cross Entropy,0
Joint Entropy,12.286549508613042
Conditional Entropy,-0.0
Mutual Information,12.286557761608659
KL Divergence,None
Lambda B,1.0
Lambda A,1.0
Chi-Squared DF,99820081
Overall J,"(0.0, 0.0)"
Hamming Loss,0.9999999999999999
Zero-one Loss,4996
NIR,0.00020016012810248197
P-Value,1
Overall CEN,0.0
Overall MCEN,0.0
Overall MCC,0.0
RR,0.5
CBA,0.0
AUNU,None
AUNP,None
RCI,1.0000006717097922
Pearson C,None
TPR Micro,0.0
TPR Macro,None
CSI,None
ARI,None
TNR Micro,0.999899909957026
TNR Macro,0.9998999199359487
Bangdiwala B,None
Krippendorff Alpha,7.616744806910965e-11
SOA1(Landis & Koch),Slight
SOA2(Fleiss),Poor
SOA3(Altman),Poor
SOA4(Cicchetti),Poor
SOA5(Cramer),None
SOA6(Matthews),Negligible
SOA7(Lambda A),Perfect
SOA8(Lambda B),Perfect
SOA9(Krippendorff Alpha),Low
SOA10(Pearson C),None
FPR Macro,0.00010008006405126668
FNR Macro,None
PPV Macro,None
NPV Macro,0.9998999200121163
ACC Macro,0.999799839948065
F1 Macro,0.0
FPR Micro,0.00010009004297395485
FNR Micro,1.0
PPV Micro,0.0
F1 Micro,0.0
NPV Micro,0.999899909957026
Spearman,

Contributor Author

✅ Supervised models have all passed validation.

Metric,Value
Overall ACC,0.0
Overall RACCU,0.005050505050505051
Overall RACC,0.0
Kappa,0.0
Gwet AC1,-0.005076142131979714
Bennett S,-0.005076142131979696
Kappa Standard Error,0.0
Kappa Unbiased,-0.005076142131979696
Scott PI,-0.005076142131979696
Kappa No Prevalence,-1.0
Kappa 95% CI,"(0.0, 0.0)"
Standard Error,0.0
95% CI,"(0.0, 0.0)"
Chi-Squared,None
Phi-Squared,None
Cramer V,None
Response Entropy,6.62935662007962
Reference Entropy,6.62935662007962
Cross Entropy,0
Joint Entropy,6.62935662007962
Conditional Entropy,-0.0
Mutual Information,6.62935662007962
KL Divergence,None
Lambda B,1.0
Lambda A,1.0
Chi-Squared DF,38809
Overall J,"(0.0, 0.0)"
Hamming Loss,1.0
Zero-one Loss,99
NIR,0.010101010101010102
P-Value,1
Overall CEN,0.0
Overall MCEN,0.0
Overall MCC,0.0
RR,0.5
CBA,0.0
AUNU,None
AUNP,None
RCI,1.0
Pearson C,None
TPR Micro,0.0
TPR Macro,None
CSI,None
ARI,None
TNR Micro,0.9949238578680203
TNR Macro,0.9949494949494949
Bangdiwala B,None
Krippendorff Alpha,0.0
SOA1(Landis & Koch),Slight
SOA2(Fleiss),Poor
SOA3(Altman),Poor
SOA4(Cicchetti),Poor
SOA5(Cramer),None
SOA6(Matthews),Negligible
SOA7(Lambda A),Perfect
SOA8(Lambda B),Perfect
SOA9(Krippendorff Alpha),Low
SOA10(Pearson C),None
FPR Macro,0.005050505050505083
FNR Macro,None
PPV Macro,None
NPV Macro,0.9949494949494949
ACC Macro,0.98989898989899
F1 Macro,0.0
FPR Micro,0.005076142131979711
FNR Micro,1.0
PPV Micro,0.0
F1 Micro,0.0
NPV Micro,0.9949238578680203
Spearman,0.6528385899814472

✅ Zero-shot models have all passed validation.

Metric,Value
Overall ACC,0.0
Overall RACCU,0.00010007998789141575
Overall RACC,0.0
Kappa,0.0
Gwet AC1,-0.00010009004298543876
Bennett S,-0.00010009008107296567
Kappa Standard Error,0.0
Kappa Unbiased,-0.00010009000489789399
Scott PI,-0.00010009000489789399
Kappa No Prevalence,-1.0
Kappa 95% CI,"(0.0, 0.0)"
Standard Error,0.0
95% CI,"(0.0, 0.0)"
Chi-Squared,None
Phi-Squared,None
Cramer V,None
Response Entropy,12.286557761608659
Reference Entropy,12.286549508613042
Cross Entropy,0
Joint Entropy,12.286549508613042
Conditional Entropy,-0.0
Mutual Information,12.286557761608659
KL Divergence,None
Lambda B,1.0
Lambda A,1.0
Chi-Squared DF,99820081
Overall J,"(0.0, 0.0)"
Hamming Loss,0.9999999999999999
Zero-one Loss,4996
NIR,0.00020016012810248197
P-Value,1
Overall CEN,0.0
Overall MCEN,0.0
Overall MCC,0.0
RR,0.5
CBA,0.0
AUNU,None
AUNP,None
RCI,1.0000006717097922
Pearson C,None
TPR Micro,0.0
TPR Macro,None
CSI,None
ARI,None
TNR Micro,0.999899909957026
TNR Macro,0.9998999199359487
Bangdiwala B,None
Krippendorff Alpha,7.616744806910965e-11
SOA1(Landis & Koch),Slight
SOA2(Fleiss),Poor
SOA3(Altman),Poor
SOA4(Cicchetti),Poor
SOA5(Cramer),None
SOA6(Matthews),Negligible
SOA7(Lambda A),Perfect
SOA8(Lambda B),Perfect
SOA9(Krippendorff Alpha),Low
SOA10(Pearson C),None
FPR Macro,0.00010008006405126668
FNR Macro,None
PPV Macro,None
NPV Macro,0.9998999200121163
ACC Macro,0.999799839948065
F1 Macro,0.0
FPR Micro,0.00010009004297395485
FNR Micro,1.0
PPV Micro,0.0
F1 Micro,0.0
NPV Micro,0.999899909957026
Spearman,

Contributor Author

✅ Supervised models have all passed validation.

Metric,Value
Overall ACC,0.0
Overall RACCU,0.005050505050505051
Overall RACC,0.0
Kappa,0.0
Gwet AC1,-0.005076142131979714
Bennett S,-0.005076142131979696
Kappa Standard Error,0.0
Kappa Unbiased,-0.005076142131979696
Scott PI,-0.005076142131979696
Kappa No Prevalence,-1.0
Kappa 95% CI,"(0.0, 0.0)"
Standard Error,0.0
95% CI,"(0.0, 0.0)"
Chi-Squared,None
Phi-Squared,None
Cramer V,None
Response Entropy,6.62935662007962
Reference Entropy,6.62935662007962
Cross Entropy,0
Joint Entropy,6.62935662007962
Conditional Entropy,-0.0
Mutual Information,6.62935662007962
KL Divergence,None
Lambda B,1.0
Lambda A,1.0
Chi-Squared DF,38809
Overall J,"(0.0, 0.0)"
Hamming Loss,1.0
Zero-one Loss,99
NIR,0.010101010101010102
P-Value,1
Overall CEN,0.0
Overall MCEN,0.0
Overall MCC,0.0
RR,0.5
CBA,0.0
AUNU,None
AUNP,None
RCI,1.0
Pearson C,None
TPR Micro,0.0
TPR Macro,None
CSI,None
ARI,None
TNR Micro,0.9949238578680203
TNR Macro,0.9949494949494949
Bangdiwala B,None
Krippendorff Alpha,0.0
SOA1(Landis & Koch),Slight
SOA2(Fleiss),Poor
SOA3(Altman),Poor
SOA4(Cicchetti),Poor
SOA5(Cramer),None
SOA6(Matthews),Negligible
SOA7(Lambda A),Perfect
SOA8(Lambda B),Perfect
SOA9(Krippendorff Alpha),Low
SOA10(Pearson C),None
FPR Macro,0.005050505050505083
FNR Macro,None
PPV Macro,None
NPV Macro,0.9949494949494949
ACC Macro,0.98989898989899
F1 Macro,0.0
FPR Micro,0.005076142131979711
FNR Micro,1.0
PPV Micro,0.0
F1 Micro,0.0
NPV Micro,0.9949238578680203
Spearman,0.6654792826221397

✅ Zero-shot models have all passed validation.

Metric,Value
Overall ACC,0.0
Overall RACCU,0.00010007998789141575
Overall RACC,0.0
Kappa,0.0
Gwet AC1,-0.00010009004298543876
Bennett S,-0.00010009008107296567
Kappa Standard Error,0.0
Kappa Unbiased,-0.00010009000489789399
Scott PI,-0.00010009000489789399
Kappa No Prevalence,-1.0
Kappa 95% CI,"(0.0, 0.0)"
Standard Error,0.0
95% CI,"(0.0, 0.0)"
Chi-Squared,None
Phi-Squared,None
Cramer V,None
Response Entropy,12.286557761608659
Reference Entropy,12.286549508613042
Cross Entropy,0
Joint Entropy,12.286549508613042
Conditional Entropy,-0.0
Mutual Information,12.286557761608659
KL Divergence,None
Lambda B,1.0
Lambda A,1.0
Chi-Squared DF,99820081
Overall J,"(0.0, 0.0)"
Hamming Loss,0.9999999999999999
Zero-one Loss,4996
NIR,0.00020016012810248197
P-Value,1
Overall CEN,0.0
Overall MCEN,0.0
Overall MCC,0.0
RR,0.5
CBA,0.0
AUNU,None
AUNP,None
RCI,1.0000006717097922
Pearson C,None
TPR Micro,0.0
TPR Macro,None
CSI,None
ARI,None
TNR Micro,0.999899909957026
TNR Macro,0.9998999199359487
Bangdiwala B,None
Krippendorff Alpha,7.616744806910965e-11
SOA1(Landis & Koch),Slight
SOA2(Fleiss),Poor
SOA3(Altman),Poor
SOA4(Cicchetti),Poor
SOA5(Cramer),None
SOA6(Matthews),Negligible
SOA7(Lambda A),Perfect
SOA8(Lambda B),Perfect
SOA9(Krippendorff Alpha),Low
SOA10(Pearson C),None
FPR Macro,0.00010008006405126668
FNR Macro,None
PPV Macro,None
NPV Macro,0.9998999200121163
ACC Macro,0.999799839948065
F1 Macro,0.0
FPR Micro,0.00010009004297395485
FNR Micro,1.0
PPV Micro,0.0
F1 Micro,0.0
NPV Micro,0.999899909957026
Spearman,

Contributor

@JCZuurmond JCZuurmond left a comment

@tintinrevient : I did a first review. Could you split the PR? It contains many changes. Please create a PR for:

  • Introducing AWS
  • Introducing the Dataset.from_path
  • Changing the model Manifest
  • And maybe more, I did not get further

Comment thread models/esm/src/pg2_model_esm/__main__.py
model_toml_file: str = typer.Option(help="Path to the model TOML file"),
nogpu: bool = typer.Option(False, help="GPUs available"),
dataset_zip_file: str = typer.Option(
default="", help="Path to the dataset ZIP file"
Contributor

@JCZuurmond JCZuurmond Jul 29, 2025

This option is required, right? Also, could you update the syntax to the annotated version, where the option is on the left side of the equals sign? And update both types to Path.

Suggested change
- default="", help="Path to the dataset ZIP file"
+ help="Path to the dataset ZIP file"

Contributor Author

The option is not required, because on AWS no local file path is passed in as user input.

manifest = Manifest.from_path(dataset_toml_file)
dataset_name = manifest.name
dataset = manifest.ingest()
dataset_zip_file = dataset_zip_file or training_data_path
Contributor

Why introduce this or?

Contributor Author

Because in the AWS environment no dataset file is passed by the user; the SageMaker training job automatically mounts the S3 path at a fixed location inside the container.
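
A sketch of this fallback, written with an explicit None check instead of `or`. SageMaker exposes input channels under /opt/ml/input/data/<channel_name>; the channel name "training" and the file name dataset.zip below are assumptions, and `resolve_dataset_path` is a hypothetical helper.

```python
from pathlib import Path
from typing import Optional

# Assumed mount point: channel "training", file "dataset.zip".
SAGEMAKER_DATASET = Path("/opt/ml/input/data/training/dataset.zip")


def resolve_dataset_path(
    cli_value: Optional[Path], default: Path = SAGEMAKER_DATASET
) -> Path:
    """Prefer the path given on the command line; otherwise use the mounted one."""
    # None means "option not passed". An empty-string default would not work
    # once the option is typed as Path, because Path("") is truthy.
    return cli_value if cli_value is not None else default


local_run = resolve_dataset_path(Path("data/dataset.zip"))
aws_run = resolve_dataset_path(None)
```

If both types are changed to Path as requested above, a default of "" becomes `Path(".")`, which is truthy, so the `or` fallback would never trigger; a None default avoids that.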

Comment thread models/esm/src/pg2_model_esm/__main__.py
dataset = Dataset.from_path(dataset_zip_file)
dataset_name = dataset.name

model_toml_file = model_toml_file or manifest_path
Contributor

Similar question about the or statement

Comment thread models/esm/manifest.toml
Comment thread models/esm/src/pg2_model_esm/manifest.py
@@ -1,10 +1,10 @@
import polars as pl
Contributor

Similar comments as for the other script

import toml


class Manifest(BaseModel):
Contributor

The manifest should probably go into the pg2-benchmark package

Contributor Author

I've moved them into pg2-benchmark! Good point: we will extend the manifest with model cards in the future, so it makes sense for it to live in pg2-benchmark. 🤔

Comment thread models/pls/src/pg2_model_pls/utils.py
@tintinrevient tintinrevient marked this pull request as draft July 29, 2025 16:29
@tintinrevient tintinrevient deleted the feat/add-dvc-aws-yaml branch August 28, 2025 12:26
Development

Successfully merging this pull request may close these issues.

Check AWS SageMaker to scale benchmark -> one assay (= one ML experiment) per dataset

3 participants