
Add decoder modeling #108

Open
mht-sharma wants to merge 33 commits into main from add_decoders

Conversation

@mht-sharma (Contributor) commented Mar 13, 2024

As per title!

  • Focus on OPT, Llama, Mistral
  • Add logits comparison tests with cpu runner
  • Add Slack notification
  • Docs
  • Fix tests: Tests fail because of issue 118
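
The logits-comparison tests mentioned above could be sketched roughly as follows (helper name and tolerances are hypothetical; a real test would feed identical inputs to the RyzenAI session and a CPU baseline and compare the outputs):

```python
import numpy as np

def assert_logits_close(ryzen_logits, cpu_logits, atol=1e-3, rtol=1e-3):
    """Compare logits from the RyzenAI runner against a CPU baseline."""
    ryzen = np.asarray(ryzen_logits, dtype=np.float32)
    cpu = np.asarray(cpu_logits, dtype=np.float32)
    assert ryzen.shape == cpu.shape, f"shape mismatch: {ryzen.shape} vs {cpu.shape}"
    max_diff = float(np.max(np.abs(ryzen - cpu)))
    assert np.allclose(ryzen, cpu, atol=atol, rtol=rtol), f"max abs diff {max_diff}"

# Dummy arrays stand in for real model outputs here.
baseline = np.zeros((1, 4, 8), dtype=np.float32)
candidate = baseline + 1e-5
assert_logits_close(candidate, baseline)
```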

Example Usage:

from optimum.amd.ryzenai import RyzenAIModelForCausalLM
from transformers import AutoTokenizer
from tests.ryzenai.testing_utils import DEFAULT_VAIP_CONFIG_TRANSFORMERS

model_path = ...  # path to an OPT/Llama model quantized using Brevitas
vaip_config = DEFAULT_VAIP_CONFIG_TRANSFORMERS
model = RyzenAIModelForCausalLM.from_pretrained(model_path, vaip_config=vaip_config)
tokenizer = AutoTokenizer.from_pretrained(model_path)

prompt = "Hey, are you conscious? Can you talk to me?"
inputs = tokenizer(prompt, return_tensors="pt")

generated_ids = model.generate(**inputs, max_new_tokens=30, do_sample=False)
print(tokenizer.batch_decode(generated_ids, skip_special_tokens=True))

@mht-sharma mht-sharma marked this pull request as ready for review March 19, 2024 12:35
@mht-sharma (Contributor, Author) commented Mar 19, 2024

Would it make sense to have these configs somewhere else and use them by default in case the user does not provide a config?
Maybe in optimum/amd/default_cfgs/
Could ask this in the meeting tomorrow.


Contributor:

yes

@fxmarty fxmarty self-requested a review March 26, 2024 11:50
@fxmarty (Contributor) left a comment:

Great work!

Comment on lines +179 to +183
# is_dynamic = RyzenAIModel._check_uses_static_shape(path)
# if is_dynamic and provider == "VitisAIExecutionProvider":
# raise ValueError(
# "The model provided has dynamic axes in input/output. Please provide model with static shapes for inference with RyzenAI."
# )
Contributor:

Should this be removed? Why is it commented out?

Contributor Author:

In their documentation, it's mentioned:

For CNN’s on NPU platform, dynamic input shapes are currently not supported and only a batch size of 1 is allowed. Please ensure that the shape of input is a fixed value, and the batch dimension is set to 1.

But since LLMs support dynamic shapes, I had to comment it out. Shifted the error to the image-classification model, as we do not support transformers models there yet.
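
The check under discussion could be sketched as a small helper (name hypothetical) that flags symbolic axes, which ONNX models report as strings (e.g. "batch_size") or None instead of concrete integers:

```python
def has_dynamic_axes(input_shapes):
    """Return True if any input shape contains a symbolic (dynamic) dimension.

    ONNX input shapes report dynamic axes as strings or None rather than ints.
    """
    for shape in input_shapes:
        for dim in shape:
            if not isinstance(dim, int) or dim < 0:
                return True
    return False

# A CNN exported with fixed shapes passes; an LLM with dynamic axes is flagged.
static_cnn = [(1, 3, 224, 224)]
dynamic_llm = [("batch_size", "sequence_length")]
assert not has_dynamic_axes(static_cnn)
assert has_dynamic_axes(dynamic_llm)
```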

Comment on lines +131 to +132
if config:
self.model_type = config.model_type
Contributor:

config is hinted as PretrainedConfig, not Optional[PretrainedConfig], so I am not sure about the control flow here.

In general, for simplicity, I think we should probably avoid dynamically defined instance attributes (i.e., all RyzenAIModel instances should, or should not, have a model_type attribute).

Contributor Author:

I guess I added it because some of their pretrained models do not have a config. So I made two changes now:

  1. Made config a hard requirement for RyzenAIModelForCausalLM
  2. Made it optional in RyzenAIModel
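
The two changes above might be sketched like this (class and config names are hypothetical): the attribute is always defined on the base class, while the CausalLM subclass enforces the config as a hard requirement:

```python
class RyzenAIModelSketch:
    def __init__(self, config=None):
        # Always define the attribute; None signals "no config available".
        self.model_type = config.model_type if config is not None else None

class RyzenAIModelForCausalLMSketch(RyzenAIModelSketch):
    def __init__(self, config):
        if config is None:
            raise ValueError("RyzenAIModelForCausalLM requires a config.")
        super().__init__(config)

class DummyConfig:
    model_type = "llama"

assert RyzenAIModelSketch().model_type is None
assert RyzenAIModelForCausalLMSketch(DummyConfig()).model_type == "llama"
```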


@classmethod
def _from_pretrained(
def _load_model_and_processors(
Contributor:

Maybe we discussed this already but why the renaming?

Contributor Author:

This method is not actually renamed; a new method was created so that _load_model_and_processors can be reused in _from_pretrained of the CausalLM class. The UI makes it look like a rename.

Comment on lines +56 to +62
self.use_fp16 = False
for inp in model.get_inputs():
if (
inp.name == "past_key_values" or inp.name in self.key_value_input_names
) and inp.type == "tensor(float16)":
self.use_fp16 = True
break
Contributor:

fp16 is not and never will be supported, no? We could probably remove the fp16-related code if so.

Contributor Author:

I did see some references to fp16 in their quantizer, but nothing concrete on device inference, so I will remove it.

"RYZENAI_SW_PATH environment variable is not set. Attempting to clone RyzenAI-SW repository now...\n"
)
ryzenai_sw_path = normalize_path(os.path.join(os.getcwd(), "RyzenAI-SW"))
clone_repository("https://github.com/amd/RyzenAI-SW/", ryzenai_sw_path)
Contributor:

Couldn't we instead have https://github.com/amd/RyzenAI-SW as a git submodule with a pinned commit (to avoid breaking changes)?

No strong opinion here; we could revisit later. Still probably a good idea to pin the commit.

Contributor Author:

I am using a specific commit after cloning to avoid breaking changes. Added RYZEN_SW_COMMIT_HASH at the top of the file.
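
Pinning the clone to a known commit could look like the following sketch. The hash below is a placeholder, not the real RYZEN_SW_COMMIT_HASH, and the helper name is hypothetical:

```python
RYZEN_SW_COMMIT_HASH = "0000000"  # placeholder; the PR pins a real commit at the top of the file

def checkout_commands(repo_url, dest, commit):
    """Build the git commands that clone a repo and then pin it to a commit."""
    return [
        ["git", "clone", repo_url, dest],
        ["git", "-C", dest, "checkout", commit],
    ]

# Each command could then be executed with subprocess.run(cmd, check=True).
cmds = checkout_commands("https://github.com/amd/RyzenAI-SW/", "RyzenAI-SW", RYZEN_SW_COMMIT_HASH)
```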

Comment on lines +305 to +307
"opt",
"llama",
"mistral",
Contributor:

gpt bigcode not here?

Contributor Author:

Added gpt bigcode. There are other models that should also be supported without changes; I would add them in another PR after full testing.
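
A membership check over the supported decoder architectures, as discussed above, might be sketched as follows (set contents taken from the diff plus gpt_bigcode; the exact model_type strings and function name are assumptions):

```python
SUPPORTED_DECODER_ARCHITECTURES = {"opt", "llama", "mistral", "gpt_bigcode"}

def check_supported(model_type):
    """Raise if the model architecture is not in the supported set."""
    if model_type not in SUPPORTED_DECODER_ARCHITECTURES:
        raise NotImplementedError(
            f"Model type '{model_type}' is not yet supported by RyzenAIModelForCausalLM."
        )

check_supported("llama")        # passes silently
# check_supported("falcon")     # would raise NotImplementedError
```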

Contributor:

yes
