Issues: deepspeedai/DeepSpeed
[BUG] Question Regarding Weights After Reloading ZeroQuant Quantized W4A8 BERT Model
Labels: bug, compression · #7060 · opened Feb 20, 2025 by RealJustinNi
[BUG] Unable to Use quantization_setting for Customizing MoQ in DeepSpeed Inference
Labels: bug, compression · #6853 · opened Dec 11, 2024 by cyx96
[BUG] Universal checkpointing doesn't work when changing model parallel size (pp and dp changes are OK)
Labels: bug, compression · #6503 · opened Sep 8, 2024 by exnx
[BUG] RuntimeError: disagreement between rank0 and rank1: rank0:
Labels: bug, compression · #5799 · opened Jul 24, 2024 by yiyepiaoling0715
[BUG] Pipeline engine's training gets stuck when zero=1
Labels: bug, compression · #5792 · opened Jul 23, 2024 by janelu9
[BUG] RuntimeError encountered when generating tokens from a Meta-Llama-3-8B-Instruct model initialized with 4-bit or 8-bit quantization
Labels: bug, compression · #5644 · opened Jun 11, 2024 by Atry
[BUG] 4-bit quantized models would repeatedly generate the same tokens when bf16.enabled is true
Labels: bug, compression · #5636 · opened Jun 10, 2024 by Atry
[BUG] RuntimeError: Error building extension 'fused_adam' / Loading extension module fused_adam
Labels: bug, compression · #5623 · opened Jun 6, 2024 by JinQiangWang2021
Change in checkpoint sizes?
Labels: bug, compression · #5365 · opened Apr 4, 2024 by MilanConrad
[BUG] DeepSpeed + LLaMA-Factory: connection interruptions and surprisingly high GPU memory usage during single-node multi-GPU fine-tuning
Labels: bug, compression · #5222 · opened Mar 4, 2024 by mizzlefeng
Does DeepSpeed support training QDQ models quantized with pytorch-quantization?
Labels: bug, compression · #5068 · opened Feb 4, 2024 by shhn1
[BUG] qgZ doesn't work for an odd number of nodes
Labels: bug, compression · #5054 · opened Feb 1, 2024 by ByronHsu
[BUG] DeepSpeed ZeRO++ hpZ hangs forever
Labels: bug, compression · #5030 · opened Jan 29, 2024 by ByronHsu
[BUG] RuntimeError: the new group's world size should be less or equal to the world size set by init_process_group
Labels: bug, compression · #4969 · opened Jan 17, 2024 by ArlanCooper
Where is the gradient?
Labels: bug, compression · #4682 · opened Nov 15, 2023 by panjianfei
[BUG] HybridEngine llama2 70B generate result is wrong and "The size of tensor a (12) must match the size of tensor b (48) at non-singleton dimension 0" when inference_tp_size > 1
Labels: bug, compression · #4345 · opened Sep 15, 2023 by xiaopqr
[BUG] ZeRO Stage 3 is consuming more memory than Stage 2 when using DeepSpeed-Chat
Labels: bug, compression · #4305 · opened Sep 12, 2023 by puyuanOT
[BUG] RuntimeError: output tensor must have the same type as input tensor
Labels: bug, compression · #3995 · opened Jul 19, 2023 by lw3259111
[BUG] AssertionError: Client Optimizer (type = <class 'NoneType'>) is not instantiated but Client LR Scheduler is instantiated
Labels: bug, compression · #3966 · opened Jul 15, 2023 by SeekPoint
[BUG] Getting ".half() is not supported" when using QLoRA
Labels: bug, compression · #3719 · opened Jun 8, 2023 by abdulvirta
[BUG] When using ZeRO-3 + TP, save_16bit_model saves only 1/tp of the weights
Labels: bug, compression · #3607 · opened May 25, 2023 by zte-tcb
[BUG] deepspeed_checkpoint.get_transformer_state reports a larger PP_degree than the real PP_degree
Labels: bug, compression · #3588 · opened May 22, 2023 by jingxu9x
[BUG] MoQ compression does not work with ZeRO
Labels: bug, compression · #3542 · opened May 15, 2023 by PYNing
[BUG] Single-node single-GPU works well but single-node multi-GPU hangs
Labels: bug, compression · #3529 · opened May 12, 2023 by zyh3826
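Several of the reports above trace back to how the DeepSpeed JSON config and deepspeed.initialize interact (for example #3966 on client optimizer vs. client LR scheduler, #5636 on bf16.enabled, and #5792 on ZeRO stage 1). The sketch below is a minimal, hypothetical setup for orientation only; the toy model and every config value are illustrative assumptions, not settings taken from any of these issues.

    # Minimal DeepSpeed setup sketch (hypothetical; all values are illustrative).
    # Launch with the DeepSpeed launcher, e.g.:  deepspeed this_script.py
    import torch
    import deepspeed

    model = torch.nn.Linear(1024, 1024)  # toy stand-in for a real network

    ds_config = {
        "train_batch_size": 8,
        "train_micro_batch_size_per_gpu": 8,
        "bf16": {"enabled": True},           # cf. the bf16-related report in #5636
        "zero_optimization": {"stage": 1},   # cf. the ZeRO stage 1 report in #5792
        "optimizer": {"type": "AdamW", "params": {"lr": 1e-4}},
        "scheduler": {"type": "WarmupLR", "params": {"warmup_num_steps": 100}},
    }

    # Defining the optimizer and LR scheduler together in the config (or passing
    # both as client objects) keeps them consistent; #3966 reports an assertion
    # when a client scheduler is supplied without a client optimizer.
    engine, optimizer, _, lr_scheduler = deepspeed.initialize(
        model=model,
        model_parameters=model.parameters(),
        config=ds_config,
    )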