Model parameter count is zero when using the fabric.sharded_model() context with the DeepSpeed ZeRO-3 strategy

### Bug description

Hi,

I use Fabric with the DeepSpeed ZeRO-3 strategy to shard a model across 2 GPUs. When the model is created inside the `fabric.sharded_model()` context, the reported model size is ```Model params = 0.0 M```.

```python
import lightning as L

fabric = L.Fabric(accelerator="cuda", strategy="deepspeed_stage_3", precision="bf16-mixed")
fabric.launch()

# `mymodel` is a placeholder for the user's own torch.nn.Module class.
with fabric.sharded_model():
    net = mymodel()

# Count the parameters as seen on this rank.
num_params = sum(param.nelement() for param in net.parameters())
fabric.print('Model params = %2.1f M' % (num_params / 1000**2))
```

Without the `fabric.sharded_model()` context, I get the correct model size: ```Model params = 13.6 M```.
How can I solve this issue? Thanks.
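
A possible workaround, sketched under an assumption rather than confirmed for this setup: when DeepSpeed ZeRO-3 partitions parameters at construction time, each `param` is left as a zero-sized placeholder and DeepSpeed records the full element count in a `ds_numel` attribute. The helper below (`count_parameters` is a hypothetical name, not part of Lightning or DeepSpeed) prefers `ds_numel` when it is present and falls back to `numel()` otherwise.

```python
def count_parameters(parameters):
    """Sum parameter element counts, preferring DeepSpeed's `ds_numel` if present."""
    total = 0
    for param in parameters:
        # Assumption: under ZeRO-3, a partitioned parameter reports numel() == 0
        # locally and carries its full (unpartitioned) size in `ds_numel`.
        if hasattr(param, "ds_numel"):
            total += param.ds_numel
        else:
            total += param.numel()
    return total
```

With the snippet above, this could be used as `num_params = count_parameters(net.parameters())` before the `fabric.print(...)` call.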


### What version are you seeing the problem on?

v2.0

### How to reproduce the bug

_No response_

### Error messages and logs

```
# Error messages and logs here please
```


### Environment

<details>
  <summary>Current environment</summary>

```
#- Lightning Component (e.g. Trainer, LightningModule, LightningApp, LightningWork, LightningFlow):
#- PyTorch Lightning Version (e.g., 1.5.0):
#- Lightning App Version (e.g., 0.5.2):
#- PyTorch Version (e.g., 2.0):
#- Python version (e.g., 3.9):
#- OS (e.g., Linux):
#- CUDA/cuDNN version:
#- GPU models and configuration:
#- How you installed Lightning(`conda`, `pip`, source):
#- Running environment of LightningApp (e.g. local, cloud):
```

</details>


### More info

_No response_

cc @carmocca @justusschock @awaelchli
