Proposal
When saving camera images, we need to use the function below to read the image values. Once the number of environments goes above 256, with 4 cameras each, the frame rate drops to about 1 fps. I found that this function, which just reads the camera output at each time step, takes almost 1 second per call. The main advantage of IsaacLab over other simulators is that it can run environments in parallel, but saving demos this way is very slow. Is this normal when saving this many cameras (4 per environment) across this many environments (256)?
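For a sense of scale, here is a rough estimate of how much image data this setup produces per step. The environment and camera counts are from my setup above; the 640x480 RGB uint8 resolution is only a hypothetical placeholder, so substitute the real camera config.

```python
def bytes_per_step(num_envs, cams_per_env, width, height,
                   channels=3, bytes_per_channel=1):
    """Raw (uncompressed) image bytes produced by all cameras in one step."""
    frames = num_envs * cams_per_env
    return frames * width * height * channels * bytes_per_channel


num_envs = 256        # from the setup described above
cams_per_env = 4      # from the setup described above
width, height = 640, 480  # ASSUMPTION: hypothetical resolution, not my actual config

total = bytes_per_step(num_envs, cams_per_env, width, height)
print(f"{num_envs * cams_per_env} frames/step, {total / 2**20:.0f} MiB/step")
```

Under these assumed numbers that is 1024 frames and about 900 MiB of raw pixels per step, so if the read copies everything to the host, a large per-step cost would not be surprising.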
Also, the GPU utilization fluctuates: 1 -> 1 -> 25 -> 1 -> 1 ...
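To narrow down where the time goes, a minimal timing sketch like the following could split each step into simulation time and image-read time. `StepTimer`, `simulate_step`, and `read_camera_output` are hypothetical names used only for illustration; they are stand-ins, not IsaacLab APIs.

```python
import time
from collections import defaultdict


class StepTimer:
    """Accumulates wall-clock time per labelled call."""

    def __init__(self):
        self.totals = defaultdict(float)
        self.counts = defaultdict(int)

    def time_call(self, label, fn, *args, **kwargs):
        # Time a single call and record it under `label`.
        start = time.perf_counter()
        result = fn(*args, **kwargs)
        self.totals[label] += time.perf_counter() - start
        self.counts[label] += 1
        return result

    def mean_seconds(self, label):
        return self.totals[label] / self.counts[label]


# Hypothetical stand-ins for the real simulation step and camera read.
def simulate_step():
    time.sleep(0.001)


def read_camera_output():
    time.sleep(0.005)
    return b"frame-bytes"


timer = StepTimer()
for _ in range(5):
    timer.time_call("simulate", simulate_step)
    frames = timer.time_call("read", read_camera_output)

for label in ("simulate", "read"):
    print(f"{label}: {timer.mean_seconds(label) * 1e3:.2f} ms/call")
```

If the work runs asynchronously on the GPU, wall-clock time can end up attributed to whichever call waits for the result, so a synchronization point (for example `torch.cuda.synchronize()` in PyTorch) before each timestamp may be needed. That is left out here to keep the sketch dependency-free.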
