
Update dependency peft to v0.19.1#61

Open
red-hat-konflux[bot] wants to merge 1 commit into konflux-poc from konflux/mintmaker/konflux-poc/peft-0.x

Conversation


red-hat-konflux[bot] commented May 17, 2025

ℹ️ Note

This PR body was truncated due to platform limits.

This PR contains the following updates:

Package: peft
Change: ==0.3.0 → ==0.19.1

Warning

Some dependencies could not be looked up. Check the warning logs for more information.


Release Notes

huggingface/peft (peft)

v0.19.1

Compare Source

A small patch release containing these fixes:

Full Changelog: huggingface/peft@v0.19.0...v0.19.1

v0.19.0

Compare Source

Highlights

This PEFT release contains no fewer than nine new PEFT methods, described below. It also contains numerous enhancements that should make PEFT more useful to many users.

New Methods
GraLoRA

@​yeonjoon-jung01 added "GraLoRA: Granular Low-Rank Adaptation for Parameter-Efficient Fine-Tuning" to PEFT (#​2851). This method subdivides the base weight into smaller blocks and applies LoRA to those. This more granular adaptation promises to increase expressiveness and improve performance, especially at higher ranks (64+), closing the gap to full fine-tuning.
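To make the "granular" idea concrete, here is a minimal sketch of the parameter accounting (illustrative only, not the PEFT implementation; shapes, the grid size k, and function names are assumptions). Subdividing a weight into a k x k grid of blocks and giving each block a LoRA pair of rank r/k keeps the total trainable parameter count comparable to a single rank-r LoRA while allowing each block to adapt independently:

```python
def lora_params(n_out, n_in, r):
    """Trainable parameters of a vanilla LoRA pair: B (n_out x r) + A (r x n_in)."""
    return n_out * r + r * n_in

def gralora_params(n_out, n_in, r, k):
    """Split the weight into a k x k grid of blocks; each block gets its own
    LoRA pair of rank r // k, keeping the total parameter count comparable."""
    block_out, block_in = n_out // k, n_in // k
    block_rank = r // k
    return k * k * lora_params(block_out, block_in, block_rank)

n_out = n_in = 4096
rank = 64
print(lora_params(n_out, n_in, rank))        # vanilla LoRA at rank 64
print(gralora_params(n_out, n_in, rank, 4))  # 4x4 grid of rank-16 block adapters
```

Note that with these illustrative shapes the two parameter counts come out identical, which is why the gain is framed as expressiveness rather than size.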

BD-LoRA

@Conzel contributed BD-LoRA: "Block-Diagonal LoRA for Eliminating Communication Overhead in Tensor Parallel LoRA Serving" (#2895). With BD-LoRA, the LoRA weights are implemented in a block-diagonal way. This reduces communication overhead when using tensor parallelism (TP) and thus allows faster serving.

There is an experiment branch for BD-LoRA support in vLLM: vllm-project/vllm#28136.
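A rough sketch of why the block-diagonal structure helps (illustrative accounting only, under assumed shapes; not the PEFT or vLLM implementation): with dense LoRA under TP, the rank-r intermediate activation must be gathered across shards, while a block-diagonal layout with one block per shard keeps the whole adapter computation local:

```python
def dense_lora_comm(batch, seq, r, tp):
    """Elements gathered across shards per layer for dense LoRA under TP
    (illustrative model: each of tp - 1 peers receives the rank-r activation)."""
    return batch * seq * r * (tp - 1)

def bd_lora_comm(batch, seq, r, tp):
    """Block-diagonal LoRA: each shard owns one of tp diagonal blocks,
    so no extra adapter traffic is needed."""
    return 0

print(dense_lora_comm(8, 2048, 64, 4))
print(bd_lora_comm(8, 2048, 64, 4))
```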

Cartridges

Thanks to @kashif, PEFT now also supports Cartridges (#2953). The main purpose of this method is to train a prefix that compresses a long context into a much shorter one, thus saving on tokens. On a low level, this is similar to prefix tuning. The PR also added an example recipe to quickly get started.

PVeRA

"PVeRA: Probabilistic Vector-Based Random Matrix Adaptation" was added to PEFT by @​leofillioux in #​2952. It is an extension of VeRA, a PEFT method that uses weight sharing between layers to be especially parameter efficient. PVeRA builds on top of that by adding a probabilistic element, sampling from the shared parameters and promising better performance overall.

PSOFT

@fei407 added PSOFT, "Efficient Orthogonal Fine-Tuning with Principal Subspace Adaptation", to PEFT in #3037. Orthogonal fine-tuning techniques like OFT and BOFT are good at preserving the structure, and thus the capabilities, of the underlying base model. PSOFT improves the efficiency of this technique by constraining the adaptation to a low-rank principal subspace.

Lily

@yibozhong added Lily: "Low-Rank Interconnected Adaptation across Layers" to PEFT in #2563. On the surface, Lily is similar to LoRA, but it uses a sophisticated parameter-sharing scheme. The A parameters are shared blockwise (e.g. 4 consecutive q_proj layers share the same A), while the B parameters come from a globally shared pool, with the actual B chosen in a data-dependent way through a router. This allows Lily to use higher ranks than LoRA while maintaining a low trainable parameter count.
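The parameter-sharing scheme can be sketched with simple accounting (illustrative shapes and function names only; this is not the PEFT implementation, and the exact sharing granularity is an assumption). The point is that sharing A per block of layers and drawing B from a small global pool lets the rank grow without growing the parameter count proportionally:

```python
def lora_total(n_layers, n_out, n_in, r):
    """Trainable parameters for vanilla LoRA: one (B, A) pair per layer."""
    return n_layers * (n_out * r + r * n_in)

def lily_total(n_layers, n_out, n_in, r, block_size, pool_size):
    """Lily-style sharing: one A per block of consecutive layers, a global
    pool of B matrices, plus tiny per-layer router weights picking from it."""
    a_params = (n_layers // block_size) * r * n_in
    b_params = pool_size * n_out * r
    router_params = n_layers * pool_size
    return a_params + b_params + router_params

layers, dim = 32, 4096
print(lora_total(layers, dim, dim, 8))                              # rank-8 LoRA
print(lily_total(layers, dim, dim, 32, block_size=4, pool_size=8))  # rank-32 Lily
```

With these toy numbers, rank-32 Lily lands at roughly the same parameter count as rank-8 LoRA.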

PEANuT

In #3084, "PEANuT: Parameter-Efficient Adaptation with Weight-aware Neural Tweakers" was added to PEFT, again by @yibozhong. PEANuT adds small neural nets (so-called weight-aware neural tweakers) to the base model. Compared to LoRA, this increases expressivity for the same trainable parameter count, or allows the parameter count to be greatly lowered without sacrificing expressivity. This comes at the expense of higher memory requirements for the same parameter count and decreased speed.

TinyLoRA

We have another serial contributor in @kashif, who also contributed TinyLoRA: "Learning to Reason in 13 Parameters" in #3024. This PEFT method allows training an extremely small number of parameters, far fewer than what could be achieved even with LoRA at rank 1. The paper shows that, particularly with reinforcement learning, training just a few parameters can often be enough to achieve good results.

AdaMSS

@​LonglongaaaGo added "AdaMSS: Adaptive Multi-Subspace Approach for Parameter-Efficient Fine-Tuning" to PEFT. This method segments the base weights of the model into smaller subspaces that are targeted for fine-tuning. Moreover, it's possible to dynamically assign a lower parameter budget to less important subspaces during training, similar to what AdaLoRA does. This promises to provide higher expressiveness and better generalization than similar PEFT methods.

Enhancements
Convert non-LoRA adapters to LoRA

In #2939, we added functions to PEFT that allow converting checkpoints of many non-LoRA methods into LoRA checkpoints. This can be useful because many other packages, e.g. Diffusers and vLLM, support only LoRA and not other PEFT methods. With the new conversion tools, more PEFT methods than just LoRA can thus be used with those packages. Conversion is lossy, but empirical testing showed that with a sufficiently high LoRA rank, the error can be quite low.
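A small sketch of why the conversion becomes lossless once the rank is high enough (a toy illustration under assumed shapes, not the PEFT conversion code): any weight delta of rank r factors exactly into B (n x r) times A (r x m), which is precisely the LoRA parameterization; loss only appears when the chosen LoRA rank is below the delta's effective rank.

```python
def outer(u, v):
    """Rank-1 matrix from two vectors."""
    return [[ui * vj for vj in v] for ui in u]

def matmul(b, a):
    """Plain nested-list matrix product."""
    return [[sum(b[i][k] * a[k][j] for k in range(len(a)))
             for j in range(len(a[0]))] for i in range(len(b))]

u, v = [1.0, 2.0, 3.0], [4.0, 5.0]
delta = outer(u, v)           # a rank-1 weight delta
b = [[ui] for ui in u]        # LoRA B: 3 x 1
a = [v]                       # LoRA A: 1 x 2
assert matmul(b, a) == delta  # exact whenever LoRA rank >= rank(delta)
```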

LoRA-GA

@​sambhavnoobcoder added a new way to initialize LoRA weights with "LoRA-GA: Low-Rank Adaptation with Gradient Approximation" (#​2926). This allows you to initialize the LoRA weights in a way that aligns the gradients with full fine-tuning and should lead to faster training convergence.

Reducing intruder dimensions

In "LoRA vs Full Fine-tuning: An Illusion of Equivalence", the authors showed that LoRA fine-tuning can introduce so-called "intruder dimensions" which contribute to forgetting. We now have a utility function in PEFT to remove intruder dimensions, reduce_intruder_dimension. When calling this on a fine-tuned LoRA model, forgetting should be reduced while performance on the fine-tuned task should remain almost the same.

Transformer Engine

In #​3048, @​balvisio added support for Transformer Engine, a quantization method by NVIDIA, to PEFT.

Tensor Parallel Support

In a series of PRs (#​3079, #​3091, #​3096), @​michaelbenayoun added support for Tensor Parallelism to LoRA.

Weight tying improvements

In many LLMs, the embedding and the LM head have tied weights to save on parameter count. This can, however, lead to tricky situations when trying to fine-tune those layers. Through a series of PRs (#​2803, #​2922, #​2870, #​2879, #​3126), we improved the user experience when doing so. Most notably, users can now pass ensure_weight_tying=True to their PEFT config to force weight tying to be upheld. Please check the PEFT weight tying docs for how weight tying is now being handled. Thanks to @​romitjain, @​sambhavnoobcoder, and @​Cursx for their contributions.
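Based on the description above, enabling this looks roughly like the following config fragment (hedged: the exact keyword surface may differ in your PEFT version; check the weight tying docs before relying on it):

```python
from peft import LoraConfig

config = LoraConfig(
    target_modules=["q_proj", "v_proj"],
    modules_to_save=["embed_tokens", "lm_head"],  # tied layers in many LLMs
    ensure_weight_tying=True,  # keep embedding and LM head tied while fine-tuning them
)
```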

Low-precision float support

#​3055 makes LoRA work with base models that use very low precision floats like torch.float8_e4m3fn. An example of that would be MiniMax-M2.5.

Zero init for PrefixTuning

#​3128 introduces zero init to Prefix Tuning which, according to our benchmarks, reduced the result variance significantly and yielded good task accuracy without the need for prompt engineering.

LoftQ + int8 quantization

With #3088, the LoftQ implementation now supports correcting quantization errors for int8 quantization (without utilizing activation thresholding), alongside the already existing nf4 quantization.

Changes
Removal of Bone

The Bone PEFT method was removed in #​3115. Users are directed to use MiSS instead, which is the improved replacement for Bone. Use this Bone-to-MiSS conversion script if you want to port old Bone checkpoints.

AutoGPTQ and AutoAWQ

These two quantization methods now use GPTQModel as their backend (#​2932) thanks to @​ZX-ModelCloud.

Handling of requires_grad in modules_to_save

Previously, PEFT would enable requires_grad on the original module if the corresponding modules_to_save was disabled. This is almost never desirable and was thus fixed. Although this change is technically backwards-incompatible, it affects an extremely niche case, so we don't expect any users to be negatively affected by it.

All Changes
New Contributors

Full Changelog: huggingface/peft@v0.18.1...v0.19.0

v0.18.1: 0.18.1

Compare Source

Small patch release containing the following changes:

  • #​2934: Small fixes required for some special cases to work with the upcoming transformers v5 release
  • #​2963: Fix to enable PEFT to run with AMD ROCm thanks to @​vladmandic
  • #​2976: Fix a regression that inadvertently required transformers >= 4.52

v0.18.0: 0.18.0: RoAd, ALoRA, Arrow, WaveFT, DeLoRA, OSF, and more

Compare Source

Highlights


New Methods
RoAd

@ppetrushkov added RoAd: 2D Rotary Adaptation to PEFT in #2678. RoAd learns 2D rotation matrices that are applied using only element-wise multiplication, thus promising very fast inference with adapters in the unmerged state.

Remarkably, besides LoRA, RoAd is the only PEFT method that supports mixed adapter batches. This means that when you have loaded a model with multiple RoAd adapters, you can use all of them for different samples in the same batch, which is much more efficient than switching adapters between batches:

model = PeftModel.from_pretrained(base_model, <path-to-road-adapter-A>, adapter_name="adapter-A")
model.load_adapter(<path-to-road-adapter-B>, adapter_name="adapter-B")

inputs = ...  # input with 3 samples

# apply adapter A to sample 0, adapter B to sample 1, and use the base model for sample 2:
adapter_names = ["adapter-A", "adapter-B", "__base__"]
output_mixed = model(**inputs, adapter_names=adapter_names)
gen_mixed = model.generate(**inputs, adapter_names=adapter_names)
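The rotation itself is cheap to illustrate (a minimal pure-Python sketch of the 2D-rotation idea, not the PEFT implementation; the pairing of consecutive features is an assumption for illustration): features are grouped in pairs and each pair is rotated by a learned angle, which costs only element-wise multiplies and adds.

```python
import math

def road_rotate(x, angles):
    """Rotate each consecutive pair (x[2i], x[2i+1]) by angles[i]."""
    out = list(x)
    for i, theta in enumerate(angles):
        c, s = math.cos(theta), math.sin(theta)
        a, b = x[2 * i], x[2 * i + 1]
        out[2 * i] = c * a - s * b
        out[2 * i + 1] = s * a + c * b
    return out

# Rotating the pairs of a 4-dim feature vector by 90 degrees each:
print(road_rotate([1.0, 0.0, 0.0, 1.0], [math.pi / 2, math.pi / 2]))
```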
ALoRA

Activated LoRA is a technique added by @kgreenewald in #2609 for causal language models, allowing LoRA adapters to be selectively enabled depending on a specific token invocation sequence in the input. This has the major benefit that most of the KV cache can be re-used during inference when the adapter is only used to generate part of the response, after which the base model takes over again.
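The gating idea can be sketched as follows (an illustrative toy, not the PEFT code; the mask semantics and function name are assumptions): the adapter only becomes active for positions at and after the point where the invocation sequence appears in the token stream, so everything before it is served by the base model and its cached KV states.

```python
def adapter_active_mask(tokens, invocation):
    """Return a 0/1 mask: 1 once the invocation sequence has been seen."""
    mask, seen = [], False
    for i in range(len(tokens)):
        window = tokens[max(0, i - len(invocation) + 1):i + 1]
        if not seen and window == invocation:
            seen = True
        mask.append(1 if seen else 0)
    return mask

# The adapter switches on once the (hypothetical) invocation tokens [9, 9] appear:
print(adapter_active_mask([5, 7, 9, 9, 2, 3], invocation=[9, 9]))
```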

Arrow & GenKnowSub

@​TheTahaaa contributed not only support for Arrow, a dynamic routing algorithm between multiple loaded LoRAs in #​2644, but also GenKnowSub, a technique built upon Arrow where the 'library' of LoRAs available to Arrow is first modified by subtracting general knowledge adapters (e.g., trained on subsets of Wikipedia) to enhance task-specific performance.

WaveFT

Thanks to @​Bilican, Wavelet Fine-Tuning (WaveFT) was added to PEFT in #​2560. This method trains sparse updates in the wavelet domain of residual matrices, which is especially parameter efficient. It is very interesting for image generation, as it promises to generate diverse outputs while preserving subject fidelity.

DeLoRA

Decoupled Low-rank Adaptation (DeLoRA) was added by @​mwbini in #​2780. This new PEFT method is similar to DoRA in so far as it decouples the angle and magnitude of the learned adapter weights. However, DeLoRA implements this in a way that promises to better prevent divergence. Moreover, it constrains the deviation of the learned weight by imposing an upper limit of the norm, which can be adjusted via the delora_lambda parameter.

OSF

Orthogonal Fine-Tuning (OSF) was added by @​NikhilNayak-debug in #​2685. By freezing the high-rank subspace of the targeted weight matrices and projecting gradient updates to a low-rank subspace, OSF achieves good performance on continual learning tasks. While it is a bit memory intensive for standard fine-tuning processes, it is definitely worth checking out on tasks where performance degradation of previously learned tasks is a concern.
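The core mechanism, projecting updates away from a protected subspace, can be sketched in one dimension of freedom (a deliberately minimal illustration, not the PEFT implementation; real OSF works with high-rank subspaces of weight matrices rather than a single frozen vector):

```python
def project_out(g, u):
    """Remove from update g its component along the frozen direction u,
    so learning happens only in the complementary subspace."""
    dot = sum(gi * ui for gi, ui in zip(g, u))
    norm2 = sum(ui * ui for ui in u)
    return [gi - (dot / norm2) * ui for gi, ui in zip(g, u)]

g = [1.0, 1.0]   # raw gradient update
u = [1.0, 0.0]   # frozen (protected) direction
print(project_out(g, u))  # update now has no component along u
```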

Enhancements
Text generation benchmark

In #​2525, @​ved1beta added the text generation benchmark to PEFT. This is a framework to determine and compare metrics with regard to text generation of different PEFT methods, e.g. runtime and memory usage. Right now, this benchmark is still lacking experimental settings and a visualization, analogous to what we have in the MetaMathQA benchmark. If this is something that interests you, we encourage you to let us know or, even better, contribute to this benchmark.

Reliable interface for integrations

PEFT has integrations with other libraries like Transformers and Diffusers. To facilitate this integration, PEFT now provides a stable interface of functions that should be used if applicable. For example, the set_adapter function can be used to switch between PEFT adapters on the model, even if the model is not a PeftModel instance. We commit to keeping these functions backwards compatible, so it's safe for other libraries to build on top of those.

Handling of weight tying

Some Transformers models can have tied weights. This is especially prevalent when it comes to the embedding and the LM head. Currently, the way that this is handled in PEFT is not obvious. We thus drafted an issue to illustrate the intended behavior in #​2864. This shows what our goal is, although not everything is implemented yet.

In #2803, @romitjain added the ensure_weight_tying argument to LoraConfig. This argument, if set to True, enforces weight tying of the modules targeted with modules_to_save. Thus, if embedding and LM head are tied, they will share weights, which is important to allow, for instance, weight merging. Therefore, for most users, we recommend enabling this setting if they want to fully fine-tune the embedding and LM head. For backwards compatibility, though, the setting is off by default.

Note that in accordance with #​2864, the functionality of ensure_weight_tying=True will be expanded to also include trainable tokens (#​2870) and LoRA (tbd.) in the future.

Support Conv1d and 1x1 Conv2 layers in LoHa and LoKr

@​grewalsk extended LoHa and LoKr to support nn.Conv1d layers, as well as nn.Conv2d with 1x1 kernels, in #​2515.

New prompt tuning initialization

Thanks to @​macmacmacmac, we now have a new initialization option for prompt tuning, random discrete initialization (#​2815). This option should generally work better than random initialization, as corroborated on our PEFT method comparison suite. Give it a try if you use prompt tuning.
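As a rough sketch of what "random discrete" initialization means here (an assumption-laden toy, not the PEFT implementation; the function name and sampling scheme are illustrative): rather than drawing prompt embeddings from a continuous random distribution, they are copied from randomly chosen rows of the model's vocabulary embedding, so every virtual token starts on the manifold of real token embeddings.

```python
import random

def random_discrete_init(embedding_table, num_virtual_tokens, seed=0):
    """Initialize virtual prompt tokens from random rows of the embedding table."""
    rng = random.Random(seed)
    rows = [rng.randrange(len(embedding_table)) for _ in range(num_virtual_tokens)]
    return [list(embedding_table[r]) for r in rows]

vocab = [[float(i), float(i) + 0.5] for i in range(10)]  # toy 10 x 2 embedding table
prompt = random_discrete_init(vocab, 3)
print(prompt)  # every row is an actual vocabulary embedding
```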

Combining LoRA adapters with negative weights

If you use multiple LoRA adapters, you can merge them into a single adapter using model.add_weighted_adapter. However, so far, this only worked with positive weights per adapter. Thanks to @​sambhavnoobcoder and @​valteu, it is now possible to pass negative weights too.
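Conceptually, what a negative weight does is subtract that adapter's contribution from the merged delta (a minimal sketch of the arithmetic, not the add_weighted_adapter implementation, which also handles scaling and combination types):

```python
def merge_deltas(deltas, weights):
    """Weighted element-wise sum of same-shaped weight deltas;
    a negative weight subtracts that adapter's contribution."""
    merged = [[0.0] * len(deltas[0][0]) for _ in deltas[0]]
    for delta, w in zip(deltas, weights):
        for i, row in enumerate(delta):
            for j, v in enumerate(row):
                merged[i][j] += w * v
    return merged

delta_a = [[1.0, 2.0], [3.0, 4.0]]  # toy delta of adapter A
delta_b = [[0.5, 0.5], [0.5, 0.5]]  # toy delta of adapter B
print(merge_deltas([delta_a, delta_b], [1.0, -1.0]))  # A minus B
```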

Changes
Transformers compatibility

At the time of writing, the Transformers v5 release is imminent. This Transformers version will be incompatible with PEFT < 0.18.0. If you plan to use Transformers v5 with PEFT, please upgrade PEFT to 0.18.0+.

Python version

This PEFT version no longer supports Python 3.9, which has reached its end of life. Please use Python 3.10+.

Updates to OFT

The OFT method has been updated to make it slightly faster and to stabilize the numerics in #2805. This means, however, that existing checkpoints may give slightly different results after upgrading to PEFT 0.18.0. Therefore, if you use OFT, we recommend retraining the adapter.

All Changes

Configuration

📅 Schedule: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).

🚦 Automerge: Disabled by config. Please merge this manually once you are satisfied.

Rebasing: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.

🔕 Ignore: Close this PR and you won't be reminded about these updates again.


  • If you want to rebase/retry this PR, check this box

To execute skipped test pipelines write comment /ok-to-test.


Documentation

Find out how to configure dependency updates in MintMaker documentation or see all available configuration options in Renovate documentation.


coveralls commented May 17, 2025

Coverage Report for CI Build 24408673715

Warning

No base build found for commit d3564c8 on konflux-poc.
Coverage changes can't be calculated without a base build.
If a base build is processing, this comment will update automatically when it completes.

Coverage: 93.407%

Details

  • Patch coverage: No coverable lines changed in this PR.

Uncovered Changes

No uncovered changes found.

Coverage Regressions

Requires a base build to compare against. How to fix this →


Coverage Stats

Relevant Lines: 91
Covered Lines: 85
Line Coverage: 93.41%
Coverage Strength: 3.87 hits per line

💛 - Coveralls

red-hat-konflux[bot] force-pushed the konflux/mintmaker/konflux-poc/peft-0.x branch from a3dffa8 to 10afd8b on July 5, 2025

coderabbitai Bot commented Jul 5, 2025

Important

Review skipped

Bot user detected.

To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.


Comment @coderabbitai help to get the list of available commands and usage tips.

red-hat-konflux[bot] changed the title from "Update dependency peft to v0.15.2" to "Update dependency peft to v0.16.0" on Jul 5, 2025
red-hat-konflux[bot] force-pushed the konflux/mintmaker/konflux-poc/peft-0.x branch from 10afd8b to b908a64 on August 9, 2025
red-hat-konflux[bot] changed the title to "Update dependency peft to v0.17.0" on Aug 9, 2025
red-hat-konflux[bot] force-pushed the branch from b908a64 to 2f9947a on August 23, 2025
red-hat-konflux[bot] changed the title to "Update dependency peft to v0.17.1" on Aug 23, 2025
red-hat-konflux[bot] force-pushed the branch from 2f9947a to ea2b697 on November 13, 2025
red-hat-konflux[bot] changed the title to "Update dependency peft to v0.18.0" on Nov 13, 2025
red-hat-konflux[bot] force-pushed the branch from ea2b697 to 252abfc on January 9, 2026
red-hat-konflux[bot] changed the title to "Update dependency peft to v0.18.1" on Jan 9, 2026
red-hat-konflux[bot] force-pushed the branch from 252abfc to d1670fa on April 14, 2026
red-hat-konflux[bot] changed the title to "Update dependency peft to v0.19.0" on Apr 14, 2026
Signed-off-by: red-hat-konflux <126015336+red-hat-konflux[bot]@users.noreply.github.com>
red-hat-konflux[bot] force-pushed the branch from d1670fa to 87e912c on April 16, 2026
red-hat-konflux[bot] changed the title to "Update dependency peft to v0.19.1" on Apr 16, 2026