Improve QLoRA docs & add Qwen2-0.5B-Instruct finetune example #1692
Description
Please provide detailed steps for QLoRA finetuning Qwen2-0.5B-Instruct on CPU: all the necessary pip installs as well as the complete code, from the first line to the last. I am requesting this because the QLoRA documentation is limited and only the first few lines of the code are provided. Code for steps such as loading the dataset and merging the QLoRA weights is not shown, and the docs only cross-reference the NeuralChat finetune example. Even the (slightly modified) provided code
```python
import torch
from intel_extension_for_transformers.transformers.modeling import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    'Qwen/Qwen2-0.5B-Instruct',
    torch_dtype=torch.float32,
    load_in_4bit=True,
    use_neural_speed=False,
)

from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training, TaskType

model = prepare_model_for_kbit_training(
    model, use_gradient_checkpointing=True
)
model.gradient_checkpointing_enable()

peft_config = LoraConfig(
    r=8,
    task_type=TaskType.CAUSAL_LM,
)
model = get_peft_model(model, peft_config)
```
does not work, and I get errors related to neural quant.
So, I am asking specifically for Qwen2-0.5B-Instruct because its finetuning is feasible on consumer PCs, it is a different architecture from the models with existing examples (such as Llama and MPT), and it is multilingual.