
Add decoder modeling #108

Open
mht-sharma wants to merge 33 commits into main from add_decoders

Conversation

@mht-sharma (Contributor) commented Mar 13, 2024

As per title!

  • Focus on OPT, Llama, Mistral
  • Add logits comparison tests with cpu runner
  • Add Slack notification
  • Docs
  • Fix tests: Tests fail because of issue 118
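
The logits-comparison tests mentioned above could be sketched roughly as follows (helper name and tolerances are hypothetical; a real test would feed identical inputs to the RyzenAI session and a CPU baseline and compare the outputs):

```python
import numpy as np

def assert_logits_close(ryzen_logits, cpu_logits, atol=1e-3, rtol=1e-3):
    """Compare logits from the RyzenAI runner against a CPU baseline."""
    ryzen = np.asarray(ryzen_logits, dtype=np.float32)
    cpu = np.asarray(cpu_logits, dtype=np.float32)
    assert ryzen.shape == cpu.shape, f"shape mismatch: {ryzen.shape} vs {cpu.shape}"
    max_diff = float(np.max(np.abs(ryzen - cpu)))
    assert np.allclose(ryzen, cpu, atol=atol, rtol=rtol), f"max abs diff {max_diff}"

# Dummy arrays stand in for real model outputs here.
baseline = np.zeros((1, 4, 8), dtype=np.float32)
candidate = baseline + 1e-5
assert_logits_close(candidate, baseline)
```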

Example Usage:

from optimum.amd.ryzenai import RyzenAIModelForCausalLM
from transformers import AutoTokenizer
from tests.ryzenai.testing_utils import DEFAULT_VAIP_CONFIG_TRANSFORMERS

model_path = ...  # path to an OPT/Llama model quantized using Brevitas
vaip_config = DEFAULT_VAIP_CONFIG_TRANSFORMERS
model = RyzenAIModelForCausalLM.from_pretrained(model_path, vaip_config=vaip_config)
tokenizer = AutoTokenizer.from_pretrained(model_path)

prompt = "Hey, are you conscious? Can you talk to me?"
inputs = tokenizer(prompt, return_tensors="pt")

generated_ids = model.generate(**inputs, max_new_tokens=30, do_sample=False)
print(tokenizer.batch_decode(generated_ids, skip_special_tokens=True))

@mht-sharma mht-sharma marked this pull request as ready for review March 19, 2024 12:35
@mht-sharma (Contributor, Author) commented Mar 19, 2024

Would it make sense to have these configs somewhere else and use them by default in case the user does not provide a config?
Maybe in optimum/amd/default_cfgs/
Could ask this in the meeting tomorrow.


Contributor:

yes

@fxmarty fxmarty self-requested a review March 26, 2024 11:50
@fxmarty (Contributor) left a comment:

Great work!

Comment on lines +179 to +183
# is_dynamic = RyzenAIModel._check_uses_static_shape(path)
# if is_dynamic and provider == "VitisAIExecutionProvider":
# raise ValueError(
# "The model provided has dynamic axes in input/output. Please provide model with static shapes for inference with RyzenAI."
# )
Contributor:

Should this be removed? Why is it commented out?

Contributor Author:

In their documentation, it's mentioned:

For CNN’s on NPU platform, dynamic input shapes are currently not supported and only a batch size of 1 is allowed. Please ensure that the shape of input is a fixed value, and the batch dimension is set to 1.

But since LLMs support dynamic shapes, I had to comment it out. Shifted the error to the image-classification model, as we do not support transformers models there yet.
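
The check under discussion could be sketched as a small helper (name hypothetical) that flags symbolic axes, which ONNX models report as strings (e.g. "batch_size") or None instead of concrete integers:

```python
def has_dynamic_axes(input_shapes):
    """Return True if any input shape contains a symbolic (dynamic) dimension.

    ONNX input shapes report dynamic axes as strings or None rather than ints.
    """
    for shape in input_shapes:
        for dim in shape:
            if not isinstance(dim, int) or dim < 0:
                return True
    return False

# A CNN exported with fixed shapes passes; an LLM with dynamic axes is flagged.
static_cnn = [(1, 3, 224, 224)]
dynamic_llm = [("batch_size", "sequence_length")]
assert not has_dynamic_axes(static_cnn)
assert has_dynamic_axes(dynamic_llm)
```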

Comment on lines +131 to +132
if config:
self.model_type = config.model_type
Contributor:

config is hinted as PretrainedConfig, not Optional[PretrainedConfig], so I am not sure about the control flow here.

In general, for simplicity, I think we should probably avoid dynamically defined instance attributes (i.e., all RyzenAIModel instances should, or should not, have a model_type attribute).

Contributor Author:

I guess I added it because some of their pretrained models do not have a config. So I made two changes now:

  1. Made config a hard requirement for RyzenAIModelForCausalLM
  2. Made it optional in RyzenAIModel
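
The two changes above might be sketched like this (class and config names are hypothetical): the attribute is always defined on the base class, while the CausalLM subclass enforces the config as a hard requirement:

```python
class RyzenAIModelSketch:
    def __init__(self, config=None):
        # Always define the attribute; None signals "no config available".
        self.model_type = config.model_type if config is not None else None

class RyzenAIModelForCausalLMSketch(RyzenAIModelSketch):
    def __init__(self, config):
        if config is None:
            raise ValueError("RyzenAIModelForCausalLM requires a config.")
        super().__init__(config)

class DummyConfig:
    model_type = "llama"

assert RyzenAIModelSketch().model_type is None
assert RyzenAIModelForCausalLMSketch(DummyConfig()).model_type == "llama"
```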


@classmethod
def _from_pretrained(
def _load_model_and_processors(
Contributor:

Maybe we discussed this already but why the renaming?

Contributor Author:

This method is not actually renamed; a new method was created so that _load_model_and_processors can be reused in _from_pretrained of the CausalLM class. The UI makes it look like a rename.

Comment on lines +56 to +62
self.use_fp16 = False
for inp in model.get_inputs():
if (
inp.name == "past_key_values" or inp.name in self.key_value_input_names
) and inp.type == "tensor(float16)":
self.use_fp16 = True
break
Contributor:

fp16 is not and never will be supported, no? We could probably remove the fp16-related code if so.

Contributor Author:

I did see some references to fp16 in their quantizer, but nothing concrete on device inference, so I will remove it.

"RYZENAI_SW_PATH environment variable is not set. Attempting to clone RyzenAI-SW repository now...\n"
)
ryzenai_sw_path = normalize_path(os.path.join(os.getcwd(), "RyzenAI-SW"))
clone_repository("https://github.com/amd/RyzenAI-SW/", ryzenai_sw_path)
Contributor:

Couldn't we instead have https://github.com/amd/RyzenAI-SW as a git submodule with a pinned commit (to avoid breaking changes)?

No strong opinion here; we could revisit later. Still probably a good idea to pin the commit.

Contributor Author:

I am using a specific commit after cloning to avoid breaking changes. Added RYZEN_SW_COMMIT_HASH at the top of the file.
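
Pinning the clone to a known commit could look like the following sketch. The hash below is a placeholder, not the real RYZEN_SW_COMMIT_HASH, and the helper name is hypothetical:

```python
RYZEN_SW_COMMIT_HASH = "0000000"  # placeholder; the PR pins a real commit at the top of the file

def checkout_commands(repo_url, dest, commit):
    """Build the git commands that clone a repo and then pin it to a commit."""
    return [
        ["git", "clone", repo_url, dest],
        ["git", "-C", dest, "checkout", commit],
    ]

# Each command could then be executed with subprocess.run(cmd, check=True).
cmds = checkout_commands("https://github.com/amd/RyzenAI-SW/", "RyzenAI-SW", RYZEN_SW_COMMIT_HASH)
```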

Comment on lines +305 to +307
"opt",
"llama",
"mistral",
Contributor:

gpt bigcode not here?

Contributor Author:

Added gpt bigcode. There are other models that should also be supported without changes; I would add them in another PR after full testing.
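
A membership check over the supported decoder architectures, as discussed above, might be sketched as follows (set contents taken from the diff plus gpt_bigcode; the exact model_type strings and function name are assumptions):

```python
SUPPORTED_DECODER_ARCHITECTURES = {"opt", "llama", "mistral", "gpt_bigcode"}

def check_supported(model_type):
    """Raise if the model architecture is not in the supported set."""
    if model_type not in SUPPORTED_DECODER_ARCHITECTURES:
        raise NotImplementedError(
            f"Model type '{model_type}' is not yet supported by RyzenAIModelForCausalLM."
        )

check_supported("llama")        # passes silently
# check_supported("falcon")     # would raise NotImplementedError
```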

Contributor:

yes
