-
Notifications
You must be signed in to change notification settings - Fork 2.1k
CI: Handle errors with MacOS and transformers #2561
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CI: Handle errors with MacOS and transformers #2561
Conversation
Changes in transformers introduced 2 errors in the MacOS CI, which are handled in this PR. Context For context on why we use torch 2.2 for MacOS, check huggingface#2431. Unfortunately, as of today, the available GH workers for MacOS still haven't improved. Description The 1st error was introduced by huggingface/transformers#37785, which results in torch.load failing when using torch < 2.6. The 2nd error was introduced by huggingface/transformers#37919, which results in a DTensor import being triggered when calling save_pretrained, which fails with MacOS and torch 2.2 (possibly also later MacOS versions, I haven't checked). The proposed solution is to plug into pytest, intercept the test report, check for these specific errors, and turn them into skips. Alternative solutions The proposed solution is obviously an ugly hack. However, these are errors we cannot fix directly, as they're caused by a dependency and are caused by the old torch version we're forced to use (thus fixing them in transformers is probably not an option). Instead of altering the test report, the individual tests that fail could get an explicit skip marker when MacOS is detected. However, since the amount of affected tests are several hundreds, this is very impractical and leads to a lot of noise in the tests. Alternatively, we could move forward with the proposal in huggingface#2431 and remove MacOS completely from the CI. I do, however, still have the faint hope that GH will provide arm64 workers with more RAM in the future, allowing us to switch.
|
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update. |
githubnemo
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM but let's see if #2564 turns out to be working - if it is, we can discuss either adding this fix or using the non-parallel variant with MacOS latest.
CI Handle error with MacOS and transformers A change in transformers introduced an error in the MacOS CI, which is handled in this PR. Context For context on why we use torch 2.2 for MacOS, check huggingface#2431. Unfortunately, as of today, the available GH workers for MacOS still haven't improved. Description The error was introduced by huggingface/transformers#37785, which results in torch.load failing when using torch < 2.6. The proposed solution is to plug into pytest, intercept the test report, check for the specific error, and mark the test as skipped instead. Alternative solutions The proposed solution is obviously an ugly hack. However, these are errors we cannot fix directly, as they're caused by a dependency and are caused by the old torch version we're forced to use (thus fixing them in transformers is probably not an option). Instead of altering the test report, the individual tests that fail could get an explicit skip marker when MacOS is detected. However, since the amount of affected tests are several hundreds, this is very impractical and leads to a lot of noise in the tests. Alternatively, we could move forward with the proposal in huggingface#2431 and remove MacOS completely from the CI. I do, however, still have the faint hope that GH will provide arm64 workers with more RAM in the future, allowing us to switch.
* better structure dpo config * fix tests * fix regex * add contributing guidelines
Changes in transformers introduced
2 errorsan error in the MacOS CI, which are handled in this PR.Edit: The second was fixed in huggingface/transformers#38496. It's still unreleased, hence the
DTensorimport errors in the Mac tests, but I think we can ignore those.Context
For context on why we use torch 2.2 for MacOS, check #2431. Unfortunately, as of today, the available GH workers for MacOS still haven't improved.
Description
The
1sterror was introduced byhuggingface/transformers#37785, which results in
torch.loadfailing when using torch < 2.6.The 2nd error was introduced byhuggingface/transformers#37919, which results in a
DTensorimport being triggered when callingsave_pretrained, which fails with MacOS and torch 2.2 (possibly also later MacOS versions, I haven't checked).The proposed solution is to plug into pytest, intercept the test report, check for these specific errors, and turn them into skips.
Alternative solutions
The proposed solution is obviously an ugly hack. However, these are errors we cannot fix directly, as they're caused by a dependency and are caused by the old torch version we're forced to use (thus fixing them in transformers is probably not an option).
Instead of altering the test report, the individual tests that fail could get an explicit skip marker when MacOS is detected. However, since the amount of affected tests are several hundreds, this is very impractical and leads to a lot of noise in the tests.
Alternatively, we could move forward with the proposal in #2431 and remove MacOS completely from the CI. I do, however, still have the faint hope that GH will provide arm64 workers with more RAM in the future, allowing us to switch.