Description
Hey, I am trying to pull the model from the Hugging Face repo using
AutoModelForMaskedLM.from_pretrained( 'mosaicml/mosaic-bert-base-seqlen-2048', trust_remote_code=True, revision='b7a0389')
With or without the revision parameter, I get the same error:
ValueError: The model class you are passing has a config_class attribute that is not consistent with the config class you passed (model has <class 'transformers.models.bert.configuration_bert.BertConfig'> and you passed <class 'transformers_modules.mosaicml.mosaic-bert-base-seqlen-2048.b7a0389deadf7a7261a3e5e7ea0680d8ba12232f.configuration_bert.BertConfig'>. Fix one of those so they match!
Do you have any suggestion as to why this might be the case?
When I do this instead: BertModel.from_pretrained('mosaicml/mosaic-bert-base-seqlen-2048')
it seems to work correctly, although I am not sure whether flash attention will work correctly, given this statement in the model card: "This model requires that trust_remote_code=True be passed to the from_pretrained method. This is because we train using FlashAttention (Dao et al. 2022), which is not part of the transformers library and depends on Triton and some custom PyTorch code." Also, the BertModel class doesn't have a trust_remote_code parameter.
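My current guess about the error above is that the two BertConfig classes share a name but are different class objects, one from the library and one from the repo's remote code, so a check on the model's config_class rejects the config. Below is a minimal sketch of that idea in plain Python. Everything in it is a hypothetical stand-in (make_config_class, ModelStub, check_config, and the module names); it is not the actual transformers code, and the identity comparison is my assumption about how the check behaves.

```python
# Hypothetical stand-ins, NOT the real transformers implementation.
# Two unrelated classes that share the name "BertConfig" but claim
# to come from different modules.

def make_config_class(module_name):
    """Create a fresh class named BertConfig that reports the given module."""
    return type("BertConfig", (), {"__module__": module_name})


# Stand-in for the config class shipped with the library.
LibraryBertConfig = make_config_class("library.configuration_bert")
# Stand-in for the config class loaded from the repo's remote code.
RemoteBertConfig = make_config_class("remote_code.configuration_bert")


class ModelStub:
    """Stand-in for a model class that expects the library config."""
    config_class = LibraryBertConfig


def check_config(model_class, config):
    """Raise if the config's class is not the one the model class expects.

    Assumption: the comparison is by class identity, so a same-named
    class from another module does not count as a match.
    """
    if type(config) is not model_class.config_class:
        raise ValueError(
            f"model has {model_class.config_class} and you passed {type(config)}"
        )


# Same name, different class objects.
print(LibraryBertConfig.__name__ == RemoteBertConfig.__name__)
print(LibraryBertConfig is RemoteBertConfig)

check_config(ModelStub, LibraryBertConfig())  # passes silently
try:
    check_config(ModelStub, RemoteBertConfig())
except ValueError as err:
    print("ValueError:", err)
```

If this is what is going on, the mismatch would come from how the remote model class is paired with a config class rather than from the weights, but I have not confirmed that against the library source.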