Description
Hello,
I noticed that DeepSpeed-Chat claims CodeGen is supported up to 16B, but from a previous issue, deepspeedai/DeepSpeed#3106,
and from an earlier discussion in MII, deepspeedai/DeepSpeed-MII#133,
I got a similar message that DeepSpeed does not yet support tensor parallelism for CodeGen. For example:
"@Emerald01 The reason you are not seeing memory savings is because DeepSpeed-inference does not support automatic kernel injection with Codegen models at this time. Without the DeepSpeed kernels, we do not shard the model across GPUs. If you were to test with a model where we do support automatic injection (e.g., gpt2), you would see the memory per GPU is reduced."
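For context, this is roughly how I understand the kernel-injection / tensor-parallel path is requested at inference time (a minimal sketch; the model name, `mp_size`, and dtype are just examples, not a confirmed working setup for CodeGen):

```python
import torch
import deepspeed
from transformers import AutoModelForCausalLM

# Load CodeGen from Hugging Face (16B variant used here only as an example).
model = AutoModelForCausalLM.from_pretrained(
    "Salesforce/codegen-16B-mono",
    torch_dtype=torch.float16,
)

# replace_with_kernel_inject=True asks DeepSpeed-Inference to swap in its fused
# kernels and shard the weights across mp_size GPUs. Per the reply quoted above,
# this automatic injection is not available for CodeGen, so the model would not
# actually be sharded on this path.
engine = deepspeed.init_inference(
    model,
    mp_size=2,
    dtype=torch.float16,
    replace_with_kernel_inject=True,
)
```

(Launched with something like `deepspeed --num_gpus 2 script.py`.)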
So it seems that DeepSpeed does not support tensor parallelism for CodeGen, and therefore ZeRO stage 3 cannot be used to split the model across the node so far. Am I right?
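For reference, the ZeRO stage 3 setup I had in mind is roughly the following (a minimal sketch; the config values and the tiny stand-in model are placeholders, not what DeepSpeed-Chat actually uses):

```python
import torch
import deepspeed

# Illustrative ZeRO stage 3 training config: stage 3 partitions parameters,
# gradients, and optimizer states across the data-parallel ranks.
ds_config = {
    "train_micro_batch_size_per_gpu": 1,
    "optimizer": {"type": "AdamW", "params": {"lr": 1e-5}},
    "fp16": {"enabled": True},
    "zero_optimization": {"stage": 3},
}

# Tiny stand-in model just to show the call; in practice this would be CodeGen.
model = torch.nn.Linear(1024, 1024)

engine, optimizer, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    config=ds_config,
)
```

My question is whether this ZeRO-3 sharding works for CodeGen even though the inference kernel injection / tensor parallelism does not.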