feat: added huggingface-regressor #122 #132
|
✅ Supervised models have all passed validation.
Metric,Value
Overall ACC,0.0
Overall RACCU,0.0030303030303030303
Overall RACC,0.0
Kappa,0.0
Gwet AC1,-0.0030395136778115636
Bennett S,-0.00303951367781155
Kappa Standard Error,0.0
Kappa Unbiased,-0.00303951367781155
Scott PI,-0.00303951367781155
Kappa No Prevalence,-1.0
Kappa 95% CI,"(0.0, 0.0)"
Standard Error,0.0
95% CI,"(0.0, 0.0)"
Chi-Squared,None
Phi-Squared,None
Cramer V,None
Response Entropy,7.366322214245807
Reference Entropy,7.366322214245807
Cross Entropy,0
Joint Entropy,7.366322214245807
Conditional Entropy,-0.0
Mutual Information,7.366322214245807
KL Divergence,None
Lambda B,1.0
Lambda A,1.0
Chi-Squared DF,108241
Overall J,"(0.0, 0.0)"
Hamming Loss,1.0
Zero-one Loss,165
NIR,0.006060606060606061
P-Value,1
Overall CEN,0.0
Overall MCEN,0.0
Overall MCC,0.0
RR,0.5
CBA,0.0
AUNU,None
AUNP,None
RCI,1.0
Pearson C,None
TPR Micro,0.0
TPR Macro,None
CSI,None
ARI,None
TNR Micro,0.9969604863221885
TNR Macro,0.996969696969697
Bangdiwala B,None
Krippendorff Alpha,0.0
SOA1(Landis & Koch),Slight
SOA2(Fleiss),Poor
SOA3(Altman),Poor
SOA4(Cicchetti),Poor
SOA5(Cramer),None
SOA6(Matthews),Negligible
SOA7(Lambda A),Perfect
SOA8(Lambda B),Perfect
SOA9(Krippendorff Alpha),Low
SOA10(Pearson C),None
FPR Macro,0.00303030303030305
FNR Macro,None
PPV Macro,None
NPV Macro,0.996969696969697
ACC Macro,0.9939393939393939
F1 Macro,0.0
FPR Micro,0.003039513677811523
FNR Micro,1.0
PPV Micro,0.0
F1 Micro,0.0
NPV Micro,0.9969604863221885
Spearman,0.011855849117089201
✅ Zero-shot models have all passed validation.
Metric,Value
Overall ACC,0.0
Overall RACCU,0.00010010001992985675
Overall RACC,0.0
Kappa,0.0
Gwet AC1,-0.00010010006100605498
Bennett S,-0.00010010010010010009
Kappa Standard Error,0.0
Kappa Unbiased,-0.00010011004094695072
Scott PI,-0.00010011004094695072
Kappa No Prevalence,-1.0
Kappa 95% CI,"(0.0, 0.0)"
Standard Error,0.0
95% CI,"(0.0, 0.0)"
Chi-Squared,None
Phi-Squared,None
Cramer V,None
Response Entropy,12.286157441352454
Reference Entropy,12.286549508613042
Cross Entropy,0
Joint Entropy,12.286549508613042
Conditional Entropy,-0.0
Mutual Information,12.286157441352454
KL Divergence,None
Lambda B,1.0
Lambda A,0.9997997997997998
Chi-Squared DF,99800100
Overall J,"(0.0, 0.0)"
Hamming Loss,0.9999999999999999
Zero-one Loss,4996
NIR,0.00020016012810248197
P-Value,1
Overall CEN,1.401066496965464e-05
Overall MCEN,1.401066496965464e-05
Overall MCC,0.0
RR,0.5000500450405365
CBA,0.0
AUNU,None
AUNP,None
RCI,0.9999680897179217
Pearson C,None
TPR Micro,0.0
TPR Macro,None
CSI,None
ARI,0.0
TNR Micro,0.9998998999380064
TNR Macro,0.9998999099189271
Bangdiwala B,None
Krippendorff Alpha,-1.9957876399588722e-08
SOA1(Landis & Koch),Slight
SOA2(Fleiss),Poor
SOA3(Altman),Poor
SOA4(Cicchetti),Poor
SOA5(Cramer),None
SOA6(Matthews),Negligible
SOA7(Lambda A),Very Strong
SOA8(Lambda B),Perfect
SOA9(Krippendorff Alpha),Low
SOA10(Pearson C),None
FPR Macro,0.00010009008107292328
FNR Macro,None
PPV Macro,None
NPV Macro,0.9998999099951021
ACC Macro,0.9997998199140291
F1 Macro,0.0
FPR Micro,0.0001001000619935688
FNR Micro,1.0
PPV Micro,0.0
F1 Micro,0.0
NPV Micro,0.9998998999380064
Spearman, |
|
The dvc step of the CI is failing, I suspect because of how the prediction dataframe is structured, which causes it to fail. Unfortunately, this is not apparent from the logs. @tintinrevient, do you know how I could check where the error happens? |
@florisvdf you can check here: https://github.com/ProteinGym/proteingym-benchmark/actions/runs/18523420137/job/52788481310 Note that this repo is being refactored, so the way dvc.yaml is structured will change this week or next. |
|
I'll review it tomorrow (I'm wrapping up some other PRs). |
if Path(SageMakerTrainingJobPath.OUTPUT_PATH).is_dir():
    df.write_csv(
        f"{SageMakerTrainingJobPath.OUTPUT_PATH}/{dataset.name}_{model_card.name}.csv"
    )

    console.print(
        f"Saved the metrics in CSV in {SageMakerTrainingJobPath.OUTPUT_PATH}/{dataset.name}_{model_card.name}.csv"
    )
else:
    console.print(f"Predictions:\n {df}")
This path check can be simplified to:
df.to_csv(
    f"{SageMakerTrainingJobPath.OUTPUT_PATH}/{dataset.name}_{model_card.name}.csv",
    index=False,
)
console.print(
    f"Saved the metrics in CSV in {SageMakerTrainingJobPath.OUTPUT_PATH}/{dataset.name}_{model_card.name}.csv"
)
The path bindings are defined and used in DVC here: https://github.com/ProteinGym/proteingym-benchmark/blob/65bedcdb5f3286f2a17ef4abc6cc3cc78c528175/benchmark/supervised/local/dvc.yaml#L22C204-L22C250.
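One way to read the simplification suggested in this comment is to build the output path once and reuse it for both the write and the log message. The sketch below is only an illustration: `save_predictions` and its parameters are hypothetical names, and `df` is assumed to expose a polars-style `write_csv(path)` method as in the diff under review.

```python
from pathlib import Path


def save_predictions(df, output_dir, dataset_name, model_name):
    """Write df to <output_dir>/<dataset_name>_<model_name>.csv and return the path.

    Hypothetical helper: `df` is assumed to have a polars-style
    `write_csv(path)` method, as in the diff under review.
    """
    # Build the path once so the write call and the log message cannot drift apart.
    csv_path = Path(output_dir) / f"{dataset_name}_{model_name}.csv"
    df.write_csv(csv_path)
    print(f"Saved the CSV in {csv_path}")
    return csv_path
```

Because the output directory is bound by DVC, the helper does not check whether the directory exists; that is the assumption the review comment relies on.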
|
✅ Supervised models have all passed validation.
Metric,Value
Overall ACC,0.0
Overall RACCU,0.0030303030303030303
Overall RACC,0.0
Kappa,0.0
Gwet AC1,-0.0030395136778115636
Bennett S,-0.00303951367781155
Kappa Standard Error,0.0
Kappa Unbiased,-0.00303951367781155
Scott PI,-0.00303951367781155
Kappa No Prevalence,-1.0
Kappa 95% CI,"(0.0, 0.0)"
Standard Error,0.0
95% CI,"(0.0, 0.0)"
Chi-Squared,None
Phi-Squared,None
Cramer V,None
Response Entropy,7.366322214245807
Reference Entropy,7.366322214245807
Cross Entropy,0
Joint Entropy,7.366322214245807
Conditional Entropy,-0.0
Mutual Information,7.366322214245807
KL Divergence,None
Lambda B,1.0
Lambda A,1.0
Chi-Squared DF,108241
Overall J,"(0.0, 0.0)"
Hamming Loss,1.0
Zero-one Loss,165
NIR,0.006060606060606061
P-Value,1
Overall CEN,0.0
Overall MCEN,0.0
Overall MCC,0.0
RR,0.5
CBA,0.0
AUNU,None
AUNP,None
RCI,1.0
Pearson C,None
TPR Micro,0.0
TPR Macro,None
CSI,None
ARI,None
TNR Micro,0.9969604863221885
TNR Macro,0.996969696969697
Bangdiwala B,None
Krippendorff Alpha,0.0
SOA1(Landis & Koch),Slight
SOA2(Fleiss),Poor
SOA3(Altman),Poor
SOA4(Cicchetti),Poor
SOA5(Cramer),None
SOA6(Matthews),Negligible
SOA7(Lambda A),Perfect
SOA8(Lambda B),Perfect
SOA9(Krippendorff Alpha),Low
SOA10(Pearson C),None
FPR Macro,0.00303030303030305
FNR Macro,None
PPV Macro,None
NPV Macro,0.996969696969697
ACC Macro,0.9939393939393939
F1 Macro,0.0
FPR Micro,0.003039513677811523
FNR Micro,1.0
PPV Micro,0.0
F1 Micro,0.0
NPV Micro,0.9969604863221885
Spearman,0.011855849117089201
✅ Zero-shot models have all passed validation.
Metric,Value
Overall ACC,0.0
Overall RACCU,0.00010007998789141575
Overall RACC,0.0
Kappa,0.0
Gwet AC1,-0.00010009004298543876
Bennett S,-0.00010009008107296567
Kappa Standard Error,0.0
Kappa Unbiased,-0.00010009000489789399
Scott PI,-0.00010009000489789399
Kappa No Prevalence,-1.0
Kappa 95% CI,"(0.0, 0.0)"
Standard Error,0.0
95% CI,"(0.0, 0.0)"
Chi-Squared,None
Phi-Squared,None
Cramer V,None
Response Entropy,12.286557761608659
Reference Entropy,12.286549508613042
Cross Entropy,0
Joint Entropy,12.286549508613042
Conditional Entropy,-0.0
Mutual Information,12.286557761608659
KL Divergence,None
Lambda B,1.0
Lambda A,1.0
Chi-Squared DF,99820081
Overall J,"(0.0, 0.0)"
Hamming Loss,0.9999999999999999
Zero-one Loss,4996
NIR,0.00020016012810248197
P-Value,1
Overall CEN,0.0
Overall MCEN,0.0
Overall MCC,0.0
RR,0.5
CBA,0.0
AUNU,None
AUNP,None
RCI,1.0000006717097922
Pearson C,None
TPR Micro,0.0
TPR Macro,None
CSI,None
ARI,None
TNR Micro,0.999899909957026
TNR Macro,0.9998999199359487
Bangdiwala B,None
Krippendorff Alpha,7.616744806910965e-11
SOA1(Landis & Koch),Slight
SOA2(Fleiss),Poor
SOA3(Altman),Poor
SOA4(Cicchetti),Poor
SOA5(Cramer),None
SOA6(Matthews),Negligible
SOA7(Lambda A),Perfect
SOA8(Lambda B),Perfect
SOA9(Krippendorff Alpha),Low
SOA10(Pearson C),None
FPR Macro,0.00010008006405126668
FNR Macro,None
PPV Macro,None
NPV Macro,0.9998999200121163
ACC Macro,0.999799839948065
F1 Macro,0.0
FPR Micro,0.00010009004297395485
FNR Micro,1.0
PPV Micro,0.0
F1 Micro,0.0
NPV Micro,0.999899909957026
Spearman, |
@florisvdf GREAT WORK!!! It is a really neat PR. I saw it passed the CML (Continuous Machine Learning) CI (currently, the image name may need to be lowercase: https://github.com/ProteinGym/proteingym-benchmark/actions/runs/18556840699/job/52896246260).
I just added another comment about the paths in the __main__.py entrypoint, and I've approved the PR. Once the pipeline passes, you can merge it.
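On the lowercase image name mentioned above: Docker repository names may not contain uppercase letters, while tags may. A minimal sketch of normalizing an image reference before building or pushing follows. The function name and the example reference are hypothetical and are not taken from this repository's CI configuration.

```python
def normalize_image_ref(ref: str) -> str:
    """Lowercase the repository part of an image reference, leaving any tag untouched."""
    # A tag, if present, follows the last ':' that comes after the last '/'.
    # A ':' before the last '/' belongs to a registry port, not a tag.
    last_slash = ref.rfind("/")
    colon = ref.rfind(":")
    if colon > last_slash:
        repo, tag = ref[:colon], ref[colon:]
    else:
        repo, tag = ref, ""
    return repo.lower() + tag


# Hypothetical image reference, for illustration only.
print(normalize_image_ref("ghcr.io/ProteinGym/HF-Regressor:v1"))
```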
|
@florisvdf how was your experience using DVC, the template model Dockerfile, and everything else? You can leave comments below. |
|
I see the Docker step fails to run. You can debug it locally with dvc repro benchmark/supervised/local ... to see the error message. |
|
|
I just managed to get the CI to pass. I updated the model card and pushed it, but that didn't trigger the workflow, so I had to run it manually (not sure why that happened). I changed RITA_xl in the model card to RITA_s, which is a much smaller model. I suspect that the CI failed because a 2B parameter model was too large for the CI runner. Changing it to RITA_s worked, but it still took 30 min, which I found a bit strange. I think we can merge now @tintinrevient. |
Overall very good. Your tip to run |
|
Not sure why the workflow isn't triggered anymore when I push changes?
It has merge conflicts with the main branch, which may be why it is not triggered; I updated the main branch this morning. Once the conflicts are resolved, we can merge. |
I've added it to the backlog (@JCZuurmond): #135 |
Co-authored-by: Shushi <zhaobenben007@googlemail.com>
89d4375 to fb24b4a
|
It seems that uv is no longer installed after rebasing onto the main branch: |
|
|
You can reference this cml.yaml: |
|
@florisvdf, I've updated the |
|
✅ Supervised models have all passed validation.
metric_name,metric_value
Overall ACC,0.0
Overall RACCU,0.0030303030303030303
Overall RACC,0.0
Kappa,0.0
Gwet AC1,-0.0030395136778115636
Bennett S,-0.00303951367781155
Kappa Standard Error,0.0
Kappa Unbiased,-0.00303951367781155
Scott PI,-0.00303951367781155
Kappa No Prevalence,-1.0
Kappa 95% CI,"(0.0, 0.0)"
Standard Error,0.0
95% CI,"(0.0, 0.0)"
Chi-Squared,None
Phi-Squared,None
Cramer V,None
Response Entropy,7.366322214245807
Reference Entropy,7.366322214245807
Cross Entropy,0
Joint Entropy,7.366322214245807
Conditional Entropy,-0.0
Mutual Information,7.366322214245807
KL Divergence,None
Lambda B,1.0
Lambda A,1.0
Chi-Squared DF,108241
Overall J,"(0.0, 0.0)"
Hamming Loss,1.0
Zero-one Loss,165
NIR,0.006060606060606061
P-Value,1
Overall CEN,0.0
Overall MCEN,0.0
Overall MCC,0.0
RR,0.5
CBA,0.0
AUNU,None
AUNP,None
RCI,1.0
Pearson C,None
TPR Micro,0.0
TPR Macro,None
CSI,None
ARI,None
TNR Micro,0.9969604863221885
TNR Macro,0.996969696969697
Bangdiwala B,None
Krippendorff Alpha,0.0
SOA1(Landis & Koch),Slight
SOA2(Fleiss),Poor
SOA3(Altman),Poor
SOA4(Cicchetti),Poor
SOA5(Cramer),None
SOA6(Matthews),Negligible
SOA7(Lambda A),Perfect
SOA8(Lambda B),Perfect
SOA9(Krippendorff Alpha),Low
SOA10(Pearson C),None
FPR Macro,0.00303030303030305
FNR Macro,None
PPV Macro,None
NPV Macro,0.996969696969697
ACC Macro,0.9939393939393939
F1 Macro,0.0
FPR Micro,0.003039513677811523
FNR Micro,1.0
PPV Micro,0.0
F1 Micro,0.0
NPV Micro,0.9969604863221885
✅ Zero-shot models have all passed validation.
metric_name,metric_value
Overall ACC,0.0
Overall RACCU,0.00010350554262465215
Overall RACC,0.0
Kappa,0.0
Gwet AC1,-0.00010027040554341335
Bennett S,-0.0001002707309736288
Kappa Standard Error,0.0
Kappa Unbiased,-0.00010351625713101698
Scott PI,-0.00010351625713101698
Kappa No Prevalence,-1.0
Kappa 95% CI,"(0.0, 0.0)"
Standard Error,0.0
95% CI,"(0.0, 0.0)"
Chi-Squared,None
Phi-Squared,None
Cramer V,None
Response Entropy,12.286557761608659
Reference Entropy,12.270402713018697
Cross Entropy,0
Joint Entropy,12.286557761608659
Conditional Entropy,0.0161550485899576
Mutual Information,12.2704027130187
KL Divergence,None
Lambda B,0.9963963963963964
Lambda A,1.0
Chi-Squared DF,99460729
Overall J,"(0.0, 0.0)"
Hamming Loss,0.9999999999999999
Zero-one Loss,4996
NIR,0.0038030424339471577
P-Value,1
Overall CEN,0.0005655020094343343
Overall MCEN,0.0005655020094343343
Overall MCC,0.0
RR,0.5009023460998596
CBA,0.0
AUNU,None
AUNP,None
RCI,1.0000000000000002
Pearson C,None
TPR Micro,0.0
TPR Macro,None
CSI,None
ARI,0.0
TNR Micro,0.9998997292690264
TNR Macro,0.9998997393222379
Bangdiwala B,None
Krippendorff Alpha,-3.425833166131969e-06
SOA1(Landis & Koch),Slight
SOA2(Fleiss),Poor
SOA3(Altman),Poor
SOA4(Cicchetti),Poor
SOA5(Cramer),None
SOA6(Matthews),Negligible
SOA7(Lambda A),Perfect
SOA8(Lambda B),Very Strong
SOA9(Krippendorff Alpha),Low
SOA10(Pearson C),None
FPR Macro,0.00010026067776214287
FNR Macro,None
PPV Macro,None
NPV Macro,0.9998997393222379
ACC Macro,0.9997994786444756
F1 Macro,0.0
FPR Micro,0.00010027073097362837
FNR Micro,1.0
PPV Micro,0.0
F1 Micro,0.0
NPV Micro,0.9998997292690264 |
|
It takes roughly 15 minutes to run 3 models against 3 datasets, pairwise. |
Thanks for helping me wrap this up! |
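As a rough sanity check on that timing, and assuming "pairwise" means every model is run on every dataset, the figure works out to about 100 seconds per run. The model and dataset names below are placeholders, since the thread does not list them.

```python
from itertools import product

# Placeholder names (assumption): the thread does not name the models or datasets.
models = ["model_a", "model_b", "model_c"]
datasets = ["dataset_1", "dataset_2", "dataset_3"]

# Every model paired with every dataset.
runs = list(product(models, datasets))

total_seconds = 15 * 60  # "roughly 15 minutes" as reported in the thread
seconds_per_run = total_seconds / len(runs)

print(len(runs), seconds_per_run)  # 9 100.0
```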




Changes
Resolves #122
Summary
Added a RITA regressor, implemented in proteingym.models.hfregressor. This module could easily be extended to use other Hugging Face-hosted PLMs by adding a dedicated Embedder class to
models/huggingface-regressor/src/proteingym/models/hfregressor/embedders and by updating the model card to specify the name of the PLM. Extra features and precomputed embeddings are not supported yet.
Checklist