Hi,
I also ran into this optimizer-state issue when fine-tuning a LLaMA model on an Ada 6000 GPU.
My torch version is 2.7.1.
The error I get is:
File "/root/.local/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py", line 1382, in _convert_all_state_info
assert dtype == info.dtype
AssertionError
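In case it helps with diagnosis, here is a minimal sketch I put together for inspecting the saved optimizer state before FSDP converts it. The checkpoint path is a placeholder on my side; the "state" layout is just the standard PyTorch optimizer state dict structure.

```python
import torch

# Diagnostic sketch: print the dtype of every tensor in a saved optimizer
# state dict before handing it to FSDP's optim-state loading path.
# "optim_state.pt" is a placeholder path, not the actual checkpoint name.
osd = torch.load("optim_state.pt", map_location="cpu")

for param_key, state in osd["state"].items():
    for name, value in state.items():
        if torch.is_tensor(value):
            # The failing `assert dtype == info.dtype` suggests these dtypes
            # disagree across ranks or with the current parameters, e.g. an
            # exp_avg saved in float32 but expected in bfloat16.
            print(param_key, name, value.dtype, tuple(value.shape))
```

If the printed dtypes are inconsistent for the same state name (e.g. some float32 and some bfloat16), that would match the failing `assert dtype == info.dtype` in `_convert_all_state_info`.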
Could you help me solve this problem?
Best regards!