Optimizing #570

GabrielCaetanoo · 2025-02-05T00:35:17Z

Changes:

init_distributed function: Extracted the distributed setup logic into a separate function.
sample function: Modified it to use torch.multinomial instead of an exponentiation-based approach for sampling.
Argument Validation: Replaced the assert with a more user-friendly validation in main to ensure that at least one of the parameters (input-file or interactive) is provided.
Interactive Code Refactoring: The user interaction logic was kept, but the init_distributed function is now called separately at the beginning of main.
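
The sample change can be illustrated without the PR's actual code, which is not shown in this thread. The following is a minimal standard-library sketch, assuming the "exponentiation-based approach" means dividing each probability by Exp(1) noise and taking the argmax; the function names are hypothetical, and `rng.choices` only stands in for the role `torch.multinomial(probs, 1)` plays on tensors.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Turn raw logits into probabilities (plain-Python stand-in for a tensor softmax)."""
    scaled = [x / max(temperature, 1e-5) for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]  # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


def sample_exponential(probs, rng):
    """Exponential-noise sampling: argmax of p_i / E_i with E_i ~ Exp(1).

    Index i wins with probability p_i, so this is a categorical draw in disguise.
    """
    # Guard against the (extremely unlikely) exact-zero noise value.
    noise = [max(rng.expovariate(1.0), 1e-300) for _ in probs]
    ratios = [p / e for p, e in zip(probs, noise)]
    return max(range(len(probs)), key=ratios.__getitem__)


def sample_multinomial(probs, rng):
    """Direct categorical draw (assumed analogue of torch.multinomial(probs, 1))."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]
```

Both functions draw from the same distribution, so under this assumption the switch would affect readability and performance characteristics rather than the sampled output distribution.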

Refactored init_distributed function: Extracted distributed setup logic into a separate function.
Updated sample function: Replaced exponential approach with torch.multinomial for sampling.
Improved argument validation: Replaced assert with a more user-friendly validation in main to ensure at least one parameter (input-file or interactive) is provided.
Refactored interactive mode logic: Maintained user interaction logic but moved init_distributed call to the beginning of main.

Magos-Technicus · 2025-02-07T11:20:50Z

Uh oh, a Brazilian? Welcome, friend, and let's go.

a-holm · 2025-04-04T20:46:58Z

inference/generate.py

+
+    # Validate input
+    if not (args.input_file or args.interactive):
+        print("Erro: É necessário especificar --input-file ou ativar --interactive.")


The new error message here is in Portuguese. Is this intended?
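
For reference, one way the validation could look with an English message is sketched below. This is a hypothetical, self-contained example and not the PR's code: only the `--input-file` and `--interactive` flags come from the diff, while `parse_args`, the defaults, and the message wording are assumptions. It uses `argparse`'s `parser.error`, which prints the usage line and the message to stderr and exits with status 2.

```python
import argparse


def parse_args(argv=None):
    """Parse CLI arguments and require at least one input mode (hypothetical helper)."""
    parser = argparse.ArgumentParser(description="Hypothetical generation CLI")
    parser.add_argument("--input-file", type=str, default="")
    parser.add_argument("--interactive", action="store_true")
    args = parser.parse_args(argv)

    # Replace a bare assert with a user-facing error in a single language.
    if not (args.input_file or args.interactive):
        parser.error("either --input-file or --interactive must be specified")
    return args
```

Unlike a bare `print`, this also stops execution, so the rest of `main` does not run with invalid arguments.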

I am Brazilian, and I get that a single language in code is better, although I would prefer to have some instructions in Brazilian Portuguese.

One language in code is essential, and no more than one. If we had two, most of it would probably be Chinese.

So... Chinese is okay but Brazilian Portuguese is not?
I don't get why people can't have instructions in multiple languages in code; it's not like there would be any downside if the code and instructions are well structured.

There is no Chinese, even though all of the main devs are Chinese. There should only be English comments.

a-holm · 2025-04-04T20:49:20Z

Many of the comments are in Portuguese, which is non-standard. The PR also includes significant changes to inference/kernel.py that are not mentioned in the description:

Removal of act_quant / act_quant_kernel.
Rewriting of weight_dequant_kernel and fp8_gemm_kernel.
Removal of Triton autotuning (@triton.autotune).
Changes to the signatures and apparent logic of weight_dequant (renamed dequantize_weights) and fp8_gemm (e.g., scaling factors removed from fp8_gemm signature).

- Improved readability and structure of Triton kernels for FP8 weight dequantization and matrix multiplication (GEMM)
- Added comments for clarity
- Replaced hardcoded block sizes with configurable parameters
- Improved safety using tl.cdiv and masking
- Renamed variables and ensured consistency in naming
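
The "tl.cdiv and masking" item refers to a common pattern for handling sizes that are not a multiple of the block size. The sketch below shows the idea in plain Python rather than Triton, so it is only an analogy to what a kernel would do: `cdiv` and `scale_blockwise` are hypothetical names, and the per-element scaling is a placeholder for real dequantization logic.

```python
def cdiv(n, block):
    """Ceiling division: number of blocks needed to cover n elements."""
    return (n + block - 1) // block


def scale_blockwise(values, scale, block_size):
    """Process `values` in fixed-size blocks, masking out-of-range offsets in the last block."""
    n = len(values)
    out = [0.0] * n
    for pid in range(cdiv(n, block_size)):  # one iteration per "program" in the launch grid
        offsets = [pid * block_size + i for i in range(block_size)]
        mask = [off < n for off in offsets]  # False for lanes past the end of the data
        for off, in_bounds in zip(offsets, mask):
            if in_bounds:
                out[off] = values[off] * scale
    return out
```

With 5 elements and a block size of 4, `cdiv` yields 2 blocks, and the mask keeps the second block from touching the 3 offsets that fall past the end.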

GabrielCaetanoo added 2 commits January 30, 2025 22:47

rakibulism approved these changes Feb 5, 2025

RedOrc approved these changes Feb 8, 2025

a-holm reviewed Apr 4, 2025

GabrielCaetanoo closed this Apr 9, 2025

GabrielCaetanoo reopened this Apr 9, 2025

Merge branch 'main' into main

a3f30dc
