Issues: deepspeedai/DeepSpeed
[BUG] Question Regarding Weights After Reloading ZeroQuant Quantized W4A8 BERT Model
Labels: bug, compression · #7060 · opened Feb 20, 2025 by RealJustinNi
[BUG] Unable to Use quantization_setting for Customizing MoQ in DeepSpeed Inference
Labels: bug, compression · #6853 · opened Dec 11, 2024 by cyx96
[BUG] Universal checkpointing doesn't work when changing model parallel size (pp and dp changes are OK)
Labels: bug, compression · #6503 · opened Sep 8, 2024 by exnx
[BUG] RuntimeError: disagreement between rank0 and rank1: rank0:
Labels: bug, compression · #5799 · opened Jul 24, 2024 by yiyepiaoling0715
[BUG] Pipeline engine's training gets stuck when zero=1
Labels: bug, compression · #5792 · opened Jul 23, 2024 by janelu9
[BUG] RuntimeError encountered when generating tokens from a Meta-Llama-3-8B-Instruct model initialized with 4-bit or 8-bit quantization
Labels: bug, compression · #5644 · opened Jun 11, 2024 by Atry
[BUG] 4-bit quantized models would repeatedly generate the same tokens when bf16.enabled is true
Labels: bug, compression · #5636 · opened Jun 10, 2024 by Atry
[BUG] RuntimeError: Error building extension 'fused_adam' / Loading extension module fused_adam
Labels: bug, compression · #5623 · opened Jun 6, 2024 by JinQiangWang2021
Change in checkpoint sizes?
Labels: bug, compression · #5365 · opened Apr 4, 2024 by MilanConrad
[BUG] DeepSpeed + LLaMA-Factory: connection interruptions and surprisingly high GPU memory usage during single-node multi-GPU fine-tuning
Labels: bug, compression · #5222 · opened Mar 4, 2024 by mizzlefeng
Does DeepSpeed support training QDQ models quantized with pytorch-quantization?
Labels: bug, compression · #5068 · opened Feb 4, 2024 by shhn1
[BUG] qgZ doesn't work for an odd number of nodes
Labels: bug, compression · #5054 · opened Feb 1, 2024 by ByronHsu
[BUG] DeepSpeed ZeRO++ hpZ hangs forever
Labels: bug, compression · #5030 · opened Jan 29, 2024 by ByronHsu
[BUG] RuntimeError: the new group's world size should be less or equal to the world size set by init_process_group
Labels: bug, compression · #4969 · opened Jan 17, 2024 by ArlanCooper
Where is the gradient?
Labels: bug, compression · #4682 · opened Nov 15, 2023 by panjianfei
[BUG] HybridEngine llama2 70B generate result is wrong and "The size of tensor a (12) must match the size of tensor b (48) at non-singleton dimension 0" when inference_tp_size > 1
Labels: bug, compression · #4345 · opened Sep 15, 2023 by xiaopqr
[BUG] ZeRO Stage 3 is consuming more memory than Stage 2 when using DeepSpeed-Chat
Labels: bug, compression · #4305 · opened Sep 12, 2023 by puyuanOT
[BUG] RuntimeError: output tensor must have the same type as input tensor
Labels: bug, compression · #3995 · opened Jul 19, 2023 by lw3259111
[BUG] AssertionError: Client Optimizer (type = <class 'NoneType'>) is not instantiated but Client LR Scheduler is instantiated
Labels: bug, compression · #3966 · opened Jul 15, 2023 by SeekPoint
[BUG] Getting ".half() is not supported" when using QLoRA
Labels: bug, compression · #3719 · opened Jun 8, 2023 by abdulvirta
[BUG] When using ZeRO-3 + TP, save_16bit_model saves only 1/tp of the weights
Labels: bug, compression · #3607 · opened May 25, 2023 by zte-tcb
[BUG] deepspeed_checkpoint.get_transformer_state reports a larger PP_degree than the real PP_degree
Labels: bug, compression · #3588 · opened May 22, 2023 by jingxu9x
[BUG] MoQ compression does not work with ZeRO
Labels: bug, compression · #3542 · opened May 15, 2023 by PYNing
[BUG] Single-node single-GPU works well but single-node multi-GPU hangs
Labels: bug, compression · #3529 · opened May 12, 2023 by zyh3826
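Several of the reports above trace back to how the DeepSpeed JSON config and deepspeed.initialize interact (for example #3966 on client optimizer vs. client LR scheduler, #5636 on bf16.enabled, and #5792 on ZeRO stage 1). The sketch below is a minimal, hypothetical setup for orientation only; the toy model and every config value are illustrative assumptions, not settings taken from any of these issues.

    # Minimal DeepSpeed setup sketch (hypothetical; all values are illustrative).
    # Launch with the DeepSpeed launcher, e.g.:  deepspeed this_script.py
    import torch
    import deepspeed

    model = torch.nn.Linear(1024, 1024)  # toy stand-in for a real network

    ds_config = {
        "train_batch_size": 8,
        "train_micro_batch_size_per_gpu": 8,
        "bf16": {"enabled": True},           # cf. the bf16-related report in #5636
        "zero_optimization": {"stage": 1},   # cf. the ZeRO stage 1 report in #5792
        "optimizer": {"type": "AdamW", "params": {"lr": 1e-4}},
        "scheduler": {"type": "WarmupLR", "params": {"warmup_num_steps": 100}},
    }

    # Defining the optimizer and LR scheduler together in the config (or passing
    # both as client objects) keeps them consistent; #3966 reports an assertion
    # when a client scheduler is supplied without a client optimizer.
    engine, optimizer, _, lr_scheduler = deepspeed.initialize(
        model=model,
        model_parameters=model.parameters(),
        config=ds_config,
    )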