Feature request
Port AdamW8bit support for CPU from the multi-backend-refactor branch to the main branch.
Motivation
Public cloud providers' machines with GPUs are usually expensive, while datacenter-grade CPUs are more readily available at lower prices. Towards the goal of making Deep Learning more accessible to developers & learners, the ability to finetune with AdamW8bit on CPU seems like a good milestone. TorchTune is currently unable to support full fine-tuning on CPU with AdamW8bit because it relies on bitsandbytes' AdamW8bit optimizer, which on the main branch requires a CUDA-capable GPU.
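
For context, this is roughly the usage that fails on a CPU-only machine today (a minimal sketch: `bnb.optim.AdamW8bit` is the real bitsandbytes API, while the toy model and hyperparameters are just illustrative):

```python
import torch
import bitsandbytes as bnb

# Toy CPU-only setup; with the current main branch this fails because
# bitsandbytes' 8-bit optimizers require the CUDA backend.
model = torch.nn.Linear(128, 64)
optimizer = bnb.optim.AdamW8bit(model.parameters(), lr=1e-4)

loss = model(torch.randn(8, 128)).sum()
loss.backward()
optimizer.step()       # with the ported CPU support, this would run on CPU
optimizer.zero_grad()
```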
#898 enabled AdamW8bit for CPU in the multi-backend-refactor branch, but the main branch doesn't have it yet.
It'd be great if we could enable AdamW8bit for CPU in the bitsandbytes main branch before TorchTune's next release (provided there is a bitsandbytes release before then), so that users who install TorchTune would automatically end up with a version of bitsandbytes that supports AdamW8bit on CPU.
Thanks!
Your contribution
@jianan-gu could port over his code from the multi-backend-refactor branch to the main branch.