
inplace hadamard#1641

Draft
wenhuach21 wants to merge 35 commits into main from hadamard_change

Conversation

Contributor

@wenhuach21 wenhuach21 commented Mar 31, 2026

Description

Please briefly describe your main changes and the motivation.

  • Delay the rotation to save RAM/VRAM.
  • Inference support.

Copilot AI review requested due to automatic review settings March 31, 2026 03:11
@wenhuach21 wenhuach21 marked this pull request as draft March 31, 2026 03:11
Contributor

Copilot AI left a comment


Pull request overview

This PR appears to change the experimental Hadamard-transform workflow (generation + application/patching) and introduces an “inplace” LLaMA2 rotation utility, while also attempting to enable a default Hadamard config from the CLI.

Changes:

  • Modified random Hadamard matrix construction and related docs in the transform utilities.
  • Refactored Hadamard application/patching logic (apply + WrapperLinear monkey-patches).
  • Added an experimental LLaMA2 “inplace” rotation module and changed CLI/BaseCompressor Hadamard configuration behavior.

Reviewed changes

Copilot reviewed 8 out of 9 changed files in this pull request and generated 9 comments.

| File | Description |
| --- | --- |
| auto_round/experimental/transform/utils/hadamard.py | Alters random Hadamard generation and leaves additional commented code. |
| auto_round/experimental/transform/patch_modules.py | Changes WrapperLinear monkey-patching to apply transforms before quantization. |
| auto_round/experimental/transform/apply.py | Updates Hadamard transform application (notably config serialization calls). |
| auto_round/experimental/hadamard_inplace/llama2.py | Adds LLaMA2-specific rotation utilities (currently includes import-time script logic). |
| auto_round/compressors/base.py | Changes how the Hadamard config is handled during compressor init. |
| auto_round/main.py | Forces a default Hadamard config when invoking tune(). |
Comments suppressed due to low confidence (2)

auto_round/experimental/transform/apply.py:100

  • In the input-transform branch, `precision=module.dtype` will fail for `torch.nn.Linear` (no `.dtype` attribute). Use `module.weight.dtype` (and similarly handle `MXQuantLinearBase` as needed) to avoid an AttributeError.

```python
if location == "input":
    # activation needs transpose
    input_hadamard_transform = build_hadamard_transform(
        **config.model_dump(),
        location="input",
        inverse=True,
        device="cpu",
        precision=module.dtype,
    )
```

auto_round/experimental/transform/patch_modules.py:64

  • `_qdq_act_patched` assigns `self.origin_qdq_act = self._qdq_act` after the monkey-patch, so `self._qdq_act` already points to `_qdq_act_patched`. Calling `self.origin_qdq_act(...)` then recurses indefinitely. Capture the original method in a closure (e.g., `orig_qdq_act = WrapperLinear._qdq_act`) before patching and call that instead.

```python
        self.origin_qdq_act = self._qdq_act
        x = inp_transform(x)
        return self.origin_qdq_act(x, act_max_scale, act_max)

    WrapperLinear._qdq_weight = _qdq_weight_patched
    WrapperLinear._qdq_act = _qdq_act_patched
    WrapperLinear._hadamard_patched = True


def patch_wrapperwalayer_forward_to_apply_transform(inp_transform):
```

Comment on lines 52 to 70

```diff
@@ -70,8 +69,8 @@ def random_hadamard_matrix(
     :param gen: Optional generator random values
     :return: randomly generated hadamard matrix
```

Copilot AI Mar 31, 2026

The `random_hadamard_matrix` docstring still documents a `dtype` parameter, but the function signature no longer accepts it. Either reintroduce the `dtype` argument or update the docstring to avoid misleading callers.
```diff
-        if "lm_head" in name:
+        if "lm_head" in name:  # TODO unrobust
             continue
         _apply_to_module(model, module, config, need_calibration, location)
```

Copilot AI Mar 31, 2026

`apply_hadamard_transform` still calls `_apply_to_module(model, module, config, need_calibration, location)`, but `_apply_to_module` now only accepts `(module, config, location)`. This will raise a TypeError at runtime; update the call and/or restore the removed parameters.

Suggested change:

```diff
-        _apply_to_module(model, module, config, need_calibration, location)
+        _apply_to_module(module, config, location)
```
```python
    model_dtype=args.model_dtype,
    momentum=args.momentum,
    trust_remote_code=not args.disable_trust_remote_code,
    hadamard_config="default",
```

Copilot AI Mar 31, 2026

This forces `hadamard_config="default"` for every CLI tune run, changing the default behavior and (with the current BaseCompressor changes) triggering the hadamard path without actually applying it or initializing `self.hadamard_config`, which can lead to downstream failures during save/export. Consider making this opt-in via a CLI flag, or keep the previous default of `None`.

Suggested change:

```diff
-    hadamard_config="default",
+    hadamard_config=None,
```
@wenhuach21 wenhuach21 changed the title [not4landing]hadamard change inplace hadamard Apr 8, 2026
