
Does DeepSpeed-Chat support tensor parallelism for CodeGen? #431


Description

@Emerald01

Hello,

I noticed that DeepSpeed-Chat claims CodeGen is supported up to 16B, but a previous issue (deepspeedai/DeepSpeed#3106) and an earlier discussion in MII (deepspeedai/DeepSpeed-MII#133) suggest otherwise.

In both cases I got a similar message that DeepSpeed does not yet support tensor parallelism for CodeGen.

For example: "@Emerald01 The reason you are not seeing memory savings is because DeepSpeed-inference does not support automatic kernel injection with Codegen models at this time. Without the DeepSpeed kernels, we do not shard the model across GPUs. If you were to test with a model where we do support automatic injection (e.g., gpt2), you would see the memory per GPU is reduced."
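
For reference, here is a minimal sketch of the kind of call that comment is describing. It assumes the public deepspeed.init_inference API and the deepspeed launcher (which sets WORLD_SIZE); gpt2 is used here only because it supports automatic injection:

```python
# Minimal sketch of DeepSpeed-inference kernel injection (assumptions:
# deepspeed and transformers are installed, and the script is launched
# with `deepspeed --num_gpus 2 script.py`).
import os

import torch
import deepspeed
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("gpt2")

# replace_with_kernel_inject=True asks DeepSpeed to swap in its fused
# kernels; only then is the model tensor-sliced across mp_size GPUs.
# For a model without injection support (e.g., CodeGen at the time of
# this issue), each GPU instead holds a full replica.
engine = deepspeed.init_inference(
    model,
    mp_size=int(os.getenv("WORLD_SIZE", "1")),
    dtype=torch.float16,
    replace_with_kernel_inject=True,
)

print(f"allocated: {torch.cuda.memory_allocated() / 2**30:.2f} GiB on this rank")
```

With gpt2 the per-GPU allocation shrinks as mp_size grows; in my case I saw no such reduction with CodeGen, which matches the quoted explanation.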

From this, I think DeepSpeed does not support tensor parallelism for CodeGen, so ZeRO stage 3 cannot be used to split the model across the node so far. Am I right?
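
For completeness, this is how I understand ZeRO stage 3 would normally be requested. It is only a sketch using standard DeepSpeed config keys, where the model, batch size, and learning rate are placeholders; whether this actually shards CodeGen in DeepSpeed-Chat is exactly what I am asking above:

```python
# Sketch of a ZeRO stage 3 setup (standard DeepSpeed config keys;
# the Linear model, batch size, and learning rate are placeholders).
import torch
import deepspeed

model = torch.nn.Linear(1024, 1024)  # stand-in for the real model

ds_config = {
    "train_batch_size": 8,
    "fp16": {"enabled": True},
    "optimizer": {"type": "Adam", "params": {"lr": 1e-4}},
    "zero_optimization": {
        "stage": 3,  # partition parameters, gradients, and optimizer states
    },
}

# Launched with `deepspeed --num_gpus N script.py`; stage 3 asks
# DeepSpeed to shard the model state across the participating ranks.
engine, optimizer, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    config=ds_config,
)
```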


Labels

deespeed chat (DeepSpeed Chat), question (Further information is requested)
