llama_functions.py backend tasks #41

@sklam

Tasks for llama_functions.py, targeting the wip/llm_intgrate branch:

  • make llama_functions.py testable with pytest ./llama_functions.py. We need independent verification of the backend features.
  • create an "inlined" version of each gen_* function; see patch_inlined_matmul.patch for reference. Then convert the existing gen_* functions to call the inlined versions to populate the functions.
  • check the matmul implementation:
    StringAttr.get("parallel") for _ in range(dims - 1)  # batch + M + N
                                              ^^^^^^^^ --- should this be `dims` (no minus 1)?
    
  • set up a benchmark for attention(), then experiment with pass fuzzing. Which passes improve performance?
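
The `dims` versus `dims - 1` question in the matmul task depends on what `dims` counts, which this issue does not state. A minimal sketch of the loop counting, assuming `rank` is the number of dimensions of each operand (batch dims plus the two matrix dims); `matmul_iterator_types` and `rank` are hypothetical names, not taken from llama_functions.py:

```python
# Sketch only: names here are illustrative, not from llama_functions.py.

def matmul_iterator_types(rank: int) -> list[str]:
    """Iterator types for a batched matmul whose operands have `rank` dims.

    The output has shape (batch..., M, N), so there are `rank` parallel loops.
    The contracted K dimension adds exactly one reduction loop.
    """
    if rank < 2:
        raise ValueError("matmul needs at least 2 dimensions")
    return ["parallel"] * rank + ["reduction"]


def test_iterator_counts():
    # 3-D operands (batch, M, K) x (batch, K, N): loops are batch, M, N, K.
    assert matmul_iterator_types(3) == [
        "parallel", "parallel", "parallel", "reduction",
    ]
    # Plain 2-D matmul: loops are M, N, K.
    assert matmul_iterator_types(2) == ["parallel", "parallel", "reduction"]
```

Under this assumption, `range(dims - 1)` is correct only if `dims` already includes the K loop; if `dims` is the operand rank, batch + M + N comes to `dims` parallel iterators. Because the test is a plain `test_*` function in the same module, passing the file path to pytest explicitly should collect it.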

Forward-looking tasks:

  • can we use linalg.matmul or even linalg.batch_matmul? The batch version can do broadcast and transpose!
  • the linalg dialect supports tensors. Can we use tensor instead of memref? We need to figure out bufferization. How does memref-based performance compare with tensor-based?
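
Comparing the memref-based and tensor-based variants (and benchmarking attention() across pass pipelines) needs a small timing harness. A minimal sketch using only the standard library; `bench`, `run_memref_variant`, and `run_tensor_variant` are hypothetical names, and the two workloads are stand-ins for calls into the compiled functions:

```python
import statistics
import time
from typing import Callable


def bench(fn: Callable[[], object], repeats: int = 5, warmup: int = 1) -> float:
    """Return the median wall-clock seconds of `fn()` over `repeats` runs."""
    for _ in range(warmup):
        fn()  # warm-up run, e.g. to keep one-time compile cost out of the samples
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


# Stand-ins: replace with calls into the compiled memref and tensor variants.
def run_memref_variant():
    return sum(i * i for i in range(10_000))


def run_tensor_variant():
    return sum(i * i for i in range(10_000))


results = {
    "memref": bench(run_memref_variant),
    "tensor": bench(run_tensor_variant),
}
for name, seconds in results.items():
    print(f"{name}: {seconds * 1e3:.3f} ms")
```

The median is used rather than the mean so that a single slow run has less influence on the reported figure; the same `bench` helper could be called once per pass pipeline when fuzzing.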
