[k8s] Fail resource requests early when there are not enough total GPUs (allocated plus free) on k8s without autoscaling

For example, if the k8s cluster has 16 GPUs in total (allocated plus free), a cluster request for 4 nodes of H100:8 (32 GPUs) should fail early:

```yaml
resources:
  accelerators: H100:8

num_nodes: 4
```

The cluster should not get stuck in the LAUNCHING state; the request should be rejected immediately during optimization.
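A minimal sketch of the kind of check this asks for, assuming the total GPU count of the k8s cluster is already known. The names `InsufficientGpuError` and `check_total_gpu_capacity` are hypothetical and only for illustration; they are not existing APIs.

```python
class InsufficientGpuError(Exception):
    """Hypothetical error: the request can never fit in the cluster."""


def check_total_gpu_capacity(total_gpus: int, gpus_per_node: int,
                             num_nodes: int) -> None:
    """Raise if the request exceeds the cluster's total GPUs.

    `total_gpus` counts both allocated and free GPUs, so a failure here
    means the request cannot be satisfied without autoscaling, no matter
    how long it waits.
    """
    requested = gpus_per_node * num_nodes
    if requested > total_gpus:
        raise InsufficientGpuError(
            f"Requested {requested} GPUs ({num_nodes} x {gpus_per_node}), "
            f"but the cluster has only {total_gpus} in total.")


# The example from this issue: 4 nodes of H100:8 on a 16-GPU cluster.
try:
    check_total_gpu_capacity(total_gpus=16, gpus_per_node=8, num_nodes=4)
except InsufficientGpuError as e:
    print(e)
```

This only compares totals. A request could still be unschedulable if no single node has enough GPUs, so a real check would probably also need per-node capacity.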