Add Phi4 #2197
Review comment: It seems that you made a copy from phi3, but made the changes in phi3/evaluation, instead of here.

Reply: Yeah, I think these two eval files need to be swapped.
@@ -0,0 +1,45 @@
# Config for EleutherEvalRecipe in eleuther_eval.py
#
# To launch, run the following command:
#   tune run eleuther_eval --config phi4/evaluation

output_dir: ./ # Not needed

# Model Arguments
model:
  _component_: torchtune.models.phi4.phi4_14b

# Checkpointer
checkpointer:
  _component_: torchtune.training.FullModelHFCheckpointer
  checkpoint_dir: /tmp/phi-4
  checkpoint_files: [
    model-00001-of-00002.safetensors,
    model-00002-of-00002.safetensors
  ]
  recipe_checkpoint: null
  output_dir: ${output_dir}
  model_type: PHI3_MINI
Review comment: /PHI3_MINI/PHI4_MINI (i.e. the model_type above still references Phi-3 and should be replaced).
resume_from_checkpoint: False

# Tokenizer
tokenizer:
  _component_: torchtune.models.phi4.phi4_14b_tokenizer
  vocab_path: /tmp/phi-4/vocab.json
  merges_path: /tmp/phi-4/merges.txt
  max_seq_len: null

# Environment
device: cuda
dtype: bf16
seed: 1234 # It is not recommended to change this seed, b/c it matches EleutherAI's default seed

# EleutherAI specific eval args
tasks: ["truthfulqa_mc2"]
limit: null
max_seq_length: 4096
batch_size: 8
enable_kv_cache: True

# Quantization specific args
quantizer: null
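All three configs in this PR are wired together through _component_ keys, so as background for reviewing them, here is a minimal Python sketch of how such an entry is consumed, assuming torchtune's config.instantiate utility (which torchtune recipes use to build objects from these YAML files) and the phi4_14b builder this PR adds: the dotted path is imported and called, with any sibling keys passed as keyword arguments.

# Minimal sketch, assuming torchtune.config.instantiate and the phi4_14b
# builder added in this PR: a _component_ entry names a dotted import path,
# and its sibling keys become keyword arguments to that callable.
from omegaconf import OmegaConf
from torchtune import config

cfg = OmegaConf.create({"model": {"_component_": "torchtune.models.phi4.phi4_14b"}})
model = config.instantiate(cfg.model)  # resolves and calls torchtune.models.phi4.phi4_14b()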
@@ -0,0 +1,110 @@
# Config for multi-device full finetuning in full_finetune_distributed.py
# using a Phi4 16K Instruct model
#
# This config assumes that you've run the following command before launching
# this run:
#   tune download microsoft/phi-4 --output-dir /tmp/phi-4 --hf-token <HF_TOKEN>
#
# Run this config on 4 GPUs using the following:
#   tune run --nproc_per_node 4 full_finetune_distributed --config phi4/full
#
# You can add specific overrides through the command line. For example,
# to override the checkpointer directory while launching training
# you can run:
#   tune run --nproc_per_node 4 full_finetune_distributed --config phi4/full checkpointer.checkpoint_dir=<YOUR_CHECKPOINT_DIR>
#
# This config works best when the model is being fine-tuned on 2+ GPUs.
# Single device full finetuning requires more memory optimizations. It's
# best to use full_low_memory.yaml for those cases.

output_dir: /tmp/torchtune/phi-4/full # /tmp may be deleted by your system. Change it to your preference.

# Model arguments
model:
  _component_: torchtune.models.phi4.phi4_14b

# Tokenizer
tokenizer:
  _component_: torchtune.models.phi4.phi4_14b_tokenizer
  vocab_path: /tmp/phi-4/vocab.json
  merges_path: /tmp/phi-4/merges.txt
  max_seq_len: null

# Checkpointer
checkpointer:
  _component_: torchtune.training.FullModelHFCheckpointer
  checkpoint_dir: /tmp/phi-4
  checkpoint_files: [
    model-00001-of-00006.safetensors,
    model-00002-of-00006.safetensors,
    model-00003-of-00006.safetensors,
    model-00004-of-00006.safetensors,
    model-00005-of-00006.safetensors,
    model-00006-of-00006.safetensors,
  ]
  recipe_checkpoint: null
  output_dir: ${output_dir}
  model_type: PHI3_MINI
resume_from_checkpoint: False

# Dataset
dataset:
  _component_: torchtune.datasets.alpaca_cleaned_dataset
  packed: False # True increases speed
seed: null
shuffle: True

# Fine-tuning arguments
epochs: 1
max_steps_per_epoch: null
batch_size: 2
gradient_accumulation_steps: 8 # Use to increase effective batch size
optimizer:
  _component_: torch.optim.AdamW
  fused: True
  lr: 5e-6
loss:
  _component_: torchtune.modules.loss.CEWithChunkedOutputLoss
compile: False # torch.compile the model + loss, True increases speed + decreases memory
optimizer_in_bwd: False # True saves memory. Requires gradient_accumulation_steps=1

# Training env
device: cuda

# Memory management
enable_activation_checkpointing: True # True reduces memory
enable_activation_offloading: False # True reduces memory
dtype: bf16

# Logging
metric_logger:
  _component_: torchtune.training.metric_logging.DiskLogger
  log_dir: ${output_dir}/logs
log_every_n_steps: 1
log_peak_memory_stats: True

# Profiler (disabled)
profiler:
  _component_: torchtune.training.setup_torch_profiler
  enabled: False

  # Output directory of trace artifacts
  output_dir: ${output_dir}/profiling_outputs

  # `torch.profiler.ProfilerActivity` types to trace
  cpu: True
  cuda: True

  # trace options passed to `torch.profiler.profile`
  profile_memory: False
  with_stack: False
  record_shapes: True
  with_flops: False

  # `torch.profiler.schedule` options:
  # wait_steps -> wait, warmup_steps -> warmup, active_steps -> active, num_cycles -> repeat
  wait_steps: 5
  warmup_steps: 3
  active_steps: 2
  num_cycles: 1
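One non-obvious interaction in the arguments above: batch_size, gradient_accumulation_steps, and the number of devices multiply into the effective batch size. A short worked computation for this file's defaults, assuming the 4-GPU launch command from the header comment:

# Effective batch size for this config (assumption: 4 GPUs, per the suggested
# "tune run --nproc_per_node 4" command above).
batch_size = 2                   # per-device batch size
gradient_accumulation_steps = 8  # optimizer steps once per 8 micro-batches
num_gpus = 4
effective_batch_size = batch_size * gradient_accumulation_steps * num_gpus
print(effective_batch_size)      # 64 samples per optimizer step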
@@ -0,0 +1,111 @@
# Config for single device full finetuning in full_finetune_single_device.py
# using a Phi4 16K Instruct model
#
# This config assumes that you've run the following command before launching
# this run:
#   tune download microsoft/phi-4 --output-dir /tmp/phi-4 --hf-token <HF_TOKEN>
#
# The default config uses an optimizer from bitsandbytes. If you do not have it installed,
# you can install it with:
#   pip install bitsandbytes
#
# To launch on a single device, run the following command from root:
#   tune run full_finetune_single_device --config phi4/full_low_memory
#
# You can add specific overrides through the command line. For example,
# to override the checkpointer directory while launching training
# you can run:
#   tune run full_finetune_single_device --config phi4/full_low_memory checkpointer.checkpoint_dir=<YOUR_CHECKPOINT_DIR>
#
# This config works only for training on a single device.

output_dir: /tmp/torchtune/phi-4/full_low_memory # /tmp may be deleted by your system. Change it to your preference.

# Model arguments
model:
  _component_: torchtune.models.phi4.phi4_14b

# Tokenizer
tokenizer:
  _component_: torchtune.models.phi4.phi4_14b_tokenizer
  vocab_path: /tmp/phi-4/vocab.json
  merges_path: /tmp/phi-4/merges.txt
  max_seq_len: null

# Checkpointer
checkpointer:
  _component_: torchtune.training.FullModelHFCheckpointer
  checkpoint_dir: /tmp/phi-4
  checkpoint_files: [
    model-00001-of-00006.safetensors,
    model-00002-of-00006.safetensors,
    model-00003-of-00006.safetensors,
    model-00004-of-00006.safetensors,
    model-00005-of-00006.safetensors,
    model-00006-of-00006.safetensors,
  ]
  recipe_checkpoint: null
  output_dir: ${output_dir}
  model_type: PHI3_MINI
resume_from_checkpoint: False

# Dataset
dataset:
  _component_: torchtune.datasets.alpaca_cleaned_dataset
  packed: False # True increases speed
seed: null
shuffle: True

# Fine-tuning arguments
epochs: 1
max_steps_per_epoch: null
batch_size: 2
gradient_accumulation_steps: 1 # Use to increase effective batch size
optimizer:
  _component_: bitsandbytes.optim.PagedAdamW
  lr: 5e-6
optimizer_in_bwd: True # True saves memory. Requires gradient_accumulation_steps=1
loss:
  _component_: torchtune.modules.loss.CEWithChunkedOutputLoss
compile: False # torch.compile the model + loss, True increases speed + decreases memory

# Training env
device: cuda

# Memory management
enable_activation_checkpointing: True # True reduces memory
enable_activation_offloading: True # True reduces memory
dtype: bf16

# Logging
metric_logger:
  _component_: torchtune.training.metric_logging.DiskLogger
  log_dir: ${output_dir}/logs
log_every_n_steps: 1
log_peak_memory_stats: True

# Profiler (disabled)
profiler:
  _component_: torchtune.training.setup_torch_profiler
  enabled: False

  # Output directory of trace artifacts
  output_dir: ${output_dir}/profiling_outputs

  # `torch.profiler.ProfilerActivity` types to trace
  cpu: True
  cuda: True

  # trace options passed to `torch.profiler.profile`
  profile_memory: False
  with_stack: False
  record_shapes: True
  with_flops: False

  # `torch.profiler.schedule` options:
  # wait_steps -> wait, warmup_steps -> warmup, active_steps -> active, num_cycles -> repeat
  wait_steps: 5
  warmup_steps: 3
  active_steps: 2
  num_cycles: 1
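The optimizer_in_bwd flag in this config saves memory by stepping each parameter as soon as its gradient is ready during backward and freeing that gradient immediately; this is also why it requires gradient_accumulation_steps=1, since no gradient survives the backward pass to accumulate into. The sketch below illustrates the general technique using PyTorch's post-accumulate-grad hooks; it is an illustration of the idea, not torchtune's actual implementation.

# Illustrative sketch of optimizer-in-backward (not torchtune's actual code):
# one optimizer per parameter, stepped from a hook that fires as soon as that
# parameter's gradient has been accumulated during backward.
import torch

model = torch.nn.Linear(16, 16)
optimizers = {p: torch.optim.AdamW([p], lr=5e-6) for p in model.parameters()}

def optimizer_hook(param):
    optimizers[param].step()
    optimizers[param].zero_grad()  # grad is freed right away, lowering peak memory

for p in model.parameters():
    p.register_post_accumulate_grad_hook(optimizer_hook)

loss = model(torch.randn(4, 16)).sum()
loss.backward()  # parameters update during backward; no separate optimizer.step()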
Review comment: folder is phi3, but args are phi4.

Reply: Yep, good point.

Review comment: Bumping this comment. Please go through both Phi-3 and Phi-4 eval files to make sure they contain the correct model references.