-
Notifications
You must be signed in to change notification settings - Fork 15.8k
Optimizing #570
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Optimizing #570
Conversation
Changes: init_distributed function: Extracted the distributed setup logic into a separate function. sample function: Modified it to use torch.multinomial instead of an exponentiation-based approach for sampling. Argument Validation: Replaced the assert with a more user-friendly validation in main to ensure that at least one of the parameters (input-file or interactive) is provided. Interactive Code Refactoring: The user interaction logic was kept, but the init_distributed function is now called separately at the beginning of main.
Here are the improvements made to the code for your commit message: Refactored init_distributed function: Extracted distributed setup logic into a separate function. Updated sample function: Replaced exponential approach with torch.multinomial for sampling. Improved argument validation: Replaced assert with a more user-friendly validation in main to ensure at least one parameter (input-file or interactive) is provided. Refactored interactive mode logic: Maintained user interaction logic but moved init_distributed call to the beginning of main.
uh oh, a Brazilian? bem vindo amigo, e bora lá |
inference/generate.py
Outdated
|
||
# Validate input | ||
if not (args.input_file or args.interactive): | ||
print("Erro: É necessário especificar --input-file ou ativar --interactive.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the new error message here are in Portuguese. Is this intended?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i am Brazilian and i get that one language in a code is better, although, i would preffer to have some instructions in portuguese-Brazilian
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One language in code is essential, no more than one language. If we have two then most would probably be chinese.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so... Chinese is okay but Portuguese-Brazilian is not okay?
i don't get why people can't have multiple instructions in code, its not like there would come with any downside if the code and instructions are well structured
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no chinese, even though all of the main devs are chinese. There should only be english comments.
So much of the comments are in Portuguese (which is non-standard), the PR also includes significant changes to
|
- Improved readability and structure of Triton kernels for FP8 weight dequantization and matrix multiplication (GEMM) - Added comments for clarity - Replaced hardcoded block sizes with configurable parameters - Improved safety using tl.cdiv and masking - Renamed variables and ensured consistency in naming
Changes:
init_distributed function: Extracted the distributed setup logic into a separate function.
sample function: Modified it to use torch.multinomial instead of an exponentiation-based approach for sampling.
Argument Validation: Replaced the assert with a more user-friendly validation in main to ensure that at least one of the parameters (input-file or interactive) is provided.
Interactive Code Refactoring: The user interaction logic was kept, but the init_distributed function is now called separately at the beginning of main.
Refactored init_distributed function: Extracted distributed setup logic into a separate function.
Updated sample function: Replaced exponential approach with torch.multinomial for sampling.
Improved argument validation: Replaced assert with a more user-friendly validation in main to ensure at least one parameter (input-file or interactive) is provided.
Refactored interactive mode logic: Maintained user interaction logic but moved init_distributed call to the beginning of main.