Issues: deepspeedai/DeepSpeedExamples
Issues list
In step 3, I hit an error when executing self.actor_model.eval()
deespeed chat
DeepSpeed Chat
hybrid engine
relating to the hybrid engine
#593
opened Jun 13, 2023 by
ZJXNEFU
what(): CUDA error: an illegal memory access was encountered
deespeed chat
DeepSpeed Chat
#592
opened Jun 13, 2023 by
zerlinkcn
[BUG] deepspeed-chat bloom training error, raise RuntimeError "still have inflight params " after 14 steps training of step3 with offload option turned on
deespeed chat
DeepSpeed Chat
new-config
A modified config from the given example
#591
opened Jun 12, 2023 by
DZ9
During the training of Step 3, the reward score of my language model collapsed to a stable point
deespeed chat
DeepSpeed Chat
modeling
Related to modeling questions.
#586
opened Jun 9, 2023 by
scarydemon2
Evaluation Loader for DeepSpeed Chat Example step 2 (reward model training)
deespeed chat
DeepSpeed Chat
modeling
Related to modeling questions.
#581
opened Jun 8, 2023 by
harveyp123
enable critic_gradient_checkpointing, get error
deespeed chat
DeepSpeed Chat
question
Further information is requested
#578
opened Jun 7, 2023 by
BaiStone2017
step3_rlhf_finetuning and two tokenizers
deespeed chat
DeepSpeed Chat
modeling
Related to modeling questions.
new-config
A modified config from the given example
question
Further information is requested
#577
opened Jun 6, 2023 by
GenVr
step2 bug fix for loss = nan when using BLOOM(which is left padding style)
deespeed chat
DeepSpeed Chat
modeling
Related to modeling questions.
#571
opened Jun 2, 2023 by
scarydemon2
messy response from model trained with opt-1.3b and Dahoas/rm-static
deespeed chat
DeepSpeed Chat
modeling
Related to modeling questions.
#569
opened Jun 2, 2023 by
treya-lin
Load model error in step3
bug
Something isn't working
deespeed chat
DeepSpeed Chat
#560
opened May 31, 2023 by
YingtongBu2
7B bloom cuda oom run step3 with 80 T4(16G) when hybridEngine is on, but ok when hybridengine is off
deespeed chat
DeepSpeed Chat
hybrid engine
relating to the hybrid engine
#559
opened May 31, 2023 by
wang990099
【problem discuss】Critic Loss can not decrease
deespeed chat
DeepSpeed Chat
modeling
Related to modeling questions.
#556
opened May 30, 2023 by
watermelon-lee
what data should I use in step 3
deespeed chat
DeepSpeed Chat
modeling
Related to modeling questions.
question
Further information is requested
#555
opened May 29, 2023 by
scarydemon2
【Need Help】What is [state, action, reward ] in NLP Scenario for PPO in deepspeed-chat
deespeed chat
DeepSpeed Chat
modeling
Related to modeling questions.
#552
opened May 26, 2023 by
valkryhx
Model and data downloading
deespeed chat
DeepSpeed Chat
question
Further information is requested
#550
opened May 26, 2023 by
treya-lin
step3 answer is not correct
deespeed chat
DeepSpeed Chat
modeling
Related to modeling questions.
#547
opened May 25, 2023 by
BaiStone2017
Recommended base docker image, torch/cuda version to use that is compatible with this code base?
deespeed chat
DeepSpeed Chat
system
An issue with an environment/system setup.
#546
opened May 23, 2023 by
abdulvirta
When to support the llama model?
deespeed chat
DeepSpeed Chat
llama
Questions related to llama model
#540
opened May 21, 2023 by
SunQiDong1999
The min_length setting forces the model to generate up to max length, which produces repeated or nonsensical results
deespeed chat
DeepSpeed Chat
modeling
Related to modeling questions.
#539
opened May 20, 2023 by
TheEighthDay
Hyper-param tuning for PPO
deespeed chat
DeepSpeed Chat
modeling
Related to modeling questions.
#532
opened May 17, 2023 by
luzai
When running step3, the error "CUDA error: misaligned address"?
deespeed chat
DeepSpeed Chat
#530
opened May 17, 2023 by
EircYangQiXin
Much more memory used in step 3 when using multi gpus compared to using single gpu
deespeed chat
DeepSpeed Chat
llama
Questions related to llama model
system
An issue with an environment/system setup.
#529
opened May 16, 2023 by
cokuehuang
Rewards in ppo seem to be recomputed many times
deespeed chat
DeepSpeed Chat
modeling
Related to modeling questions.
#528
opened May 16, 2023 by
dwyzzy
[bug]AttributeError: 'DeepSpeedHybridEngine' object has no attribute 'mp_group'
bug
Something isn't working
deespeed chat
DeepSpeed Chat
hybrid engine
relating to the hybrid engine
#525
opened May 15, 2023 by
qingchu123
OOM problem when fine-tune reward model with LLaMA in step 2
deespeed chat
DeepSpeed Chat
llama
Questions related to llama model
#521
opened May 11, 2023 by
kiseliu