Description
LLMs are prone to hallucinations, i.e., generating content that deviates from facts seen during training. There are two simple decoding strategies for reducing hallucinations with pretrained LLMs: DoLA and SLED. Both improve factual accuracy by contrasting logits from different layers, which requires a forward pass that exposes intermediate layer outputs. The standard mlx_lm.generate pipeline does not currently support this.
- DoLA (https://arxiv.org/pdf/2309.03883) contrasts the final layer with the premature layer whose output distribution has the highest Jensen-Shannon divergence (JSD) from it, and uses the difference in log-probabilities to update the token probabilities (a rough sketch follows this list).
- SLED (https://arxiv.org/pdf/2411.02433v3) instead estimates a latent knowledge distribution from all premature layers and evolves the output distribution by minimizing the KL divergence between the two.
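To make the request concrete, here is a rough sketch of DoLA's contrast step, assuming early-exit logits for each candidate layer are already available (obtained by applying the model's final norm and LM head to the intermediate hidden states). The function and argument names here are made up for illustration and are not part of the mlx_lm API:

```python
import mlx.core as mx

def dola_logits(final_logits, candidate_logits, alpha=0.1):
    # final_logits: [vocab_size] logits from the last layer at the current position
    # candidate_logits: list of [vocab_size] early-exit logits from candidate layers
    eps = 1e-10
    p_final = mx.softmax(final_logits, axis=-1)

    def jsd(p, q):
        # Jensen-Shannon divergence between two probability vectors
        m = 0.5 * (p + q)
        kl_pm = (p * (mx.log(p + eps) - mx.log(m + eps))).sum(axis=-1)
        kl_qm = (q * (mx.log(q + eps) - mx.log(m + eps))).sum(axis=-1)
        return 0.5 * (kl_pm + kl_qm)

    # Premature layer = candidate with the highest JSD from the final layer
    divs = mx.stack([jsd(p_final, mx.softmax(l, axis=-1)) for l in candidate_logits])
    premature_logits = candidate_logits[mx.argmax(divs).item()]

    # Contrast log-probabilities, restricted to the adaptive plausibility set
    # (tokens whose final-layer probability is at least alpha * max probability)
    log_final = final_logits - mx.logsumexp(final_logits, axis=-1, keepdims=True)
    log_premature = premature_logits - mx.logsumexp(premature_logits, axis=-1, keepdims=True)
    plausible = p_final >= alpha * p_final.max(axis=-1, keepdims=True)
    return mx.where(plausible, log_final - log_premature, float("-inf"))
```

The returned scores would replace the raw final-layer logits in the sampling step; tokens outside the plausibility set are masked to -inf, as in the paper.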
Implementing both should be relatively easy, since each individual model only needs to return its intermediate layer outputs (see the sketch below); reference code for the final computation is also available.
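On the model side, something like the following per-model change could expose the needed states. The helper name and the attribute layout (embed_tokens, layers, norm) follow common mlx_lm conventions but are assumptions, and the attention mask / KV cache handling that real models thread through each layer is elided:

```python
import mlx.core as mx

def forward_with_hidden_states(model, inputs: mx.array):
    """Run a decoder-only transformer and collect the hidden state after
    every block. Assumes model.embed_tokens / model.layers / model.norm;
    mask and cache handling are elided for brevity."""
    h = model.embed_tokens(inputs)
    hidden_states = []
    for layer in model.layers:
        h = layer(h)
        hidden_states.append(h)
    # Early-exit logits come from applying model.norm and the LM head to
    # each collected state; the final logits use the last entry.
    return model.norm(h), hidden_states
```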
Please implement both options (or just SLED, since it shows better metrics per https://jayzhang42.github.io/sled_page/). Thanks!