[Inf2] Add Optimum Neuron support for Encoder models #73

Merged: 20 commits merged on Jul 4, 2024

philschmid
Contributor

What does this PR do?

This PR adds support for Optimum Neuron with encoder model tasks in the Hugging Face Inference Toolkit. This allows us to enable Inferentia instances for all supported models and tasks including:

Supported Encoder NeuronX Architectures:

  • albert
  • bert
  • camembert
  • convbert
  • deberta
  • deberta-v2
  • distilbert
  • electra
  • roberta
  • mobilebert
  • mpnet
  • vit
  • xlm
  • xlm-roberta

Supported NeuronX Tasks:

  • text-classification
  • text-generation
  • token-classification
  • fill-mask
  • question-answering
  • feature-extraction
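
For anyone who wants to guard against unsupported combinations before picking an Inferentia instance, here is a minimal sketch of a support check built from the two lists above. The helper name `is_neuron_supported` and both constants are hypothetical and are not part of the toolkit or of Optimum Neuron. It checks each list independently, so it does not verify that a given architecture/task pair actually makes sense together.

```python
# Hypothetical helper, not the toolkit's actual code. The two sets are copied
# verbatim from the lists in this PR description.
SUPPORTED_ARCHITECTURES = frozenset({
    "albert", "bert", "camembert", "convbert", "deberta", "deberta-v2",
    "distilbert", "electra", "roberta", "mobilebert", "mpnet", "vit",
    "xlm", "xlm-roberta",
})

SUPPORTED_TASKS = frozenset({
    "text-classification", "text-generation", "token-classification",
    "fill-mask", "question-answering", "feature-extraction",
})


def is_neuron_supported(model_type: str, task: str) -> bool:
    """Return True if the architecture and the task both appear in the PR's lists."""
    return model_type.lower() in SUPPORTED_ARCHITECTURES and task in SUPPORTED_TASKS


if __name__ == "__main__":
    print(is_neuron_supported("bert", "fill-mask"))
    print(is_neuron_supported("t5", "summarization"))
```

In practice `model_type` would likely come from the model's `config.json`, but how the toolkit resolves it is not shown in this PR description.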

The CI doesn't support running Inferentia2 tests, therefore I ran them locally.
Screenshot 2024-07-01 at 16:17:59
Screenshot 2024-07-01 at 16:15:24

@philschmid philschmid requested a review from oOraph July 1, 2024 14:23
@@ -0,0 +1,122 @@
# Build based on https://github.com/aws/deep-learning-containers/blob/master/huggingface/pytorch/inference/docker/2.1/py3/sdk2.18.0/Dockerfile.neuronx
FROM ubuntu:20.04
Contributor

FROM ubuntu:22.04 or 24.04?

Contributor Author

It's what AWS uses. We should stick to the image they use. Using 22.04 led to errors in the past with Neuron.

@oOraph oOraph self-requested a review July 1, 2024 15:16
@philschmid philschmid merged commit b551223 into main Jul 4, 2024
6 checks passed
@philschmid philschmid deleted the add-inf2-support branch July 4, 2024 06:58
2 participants