tests: enhance debuggability for issue 902 #922
jgehrcke merged 3 commits into kubernetes-sigs:main
Signed-off-by: Dr. Jan-Philip Gehrcke <jgehrcke@nvidia.com>
Signed-off-by: Dr. Jan-Philip Gehrcke <jgehrcke@nvidia.com>
-  local attrs=$(get_device_attrs_from_any_gpu_slice "gpu")
+  run get_device_attrs_from_any_gpu_slice "gpu"
+  assert_success
+  local attrs="$output"
This change (also in the other places) makes the test fail explicitly when get_device_attrs_from_any_gpu_slice fails. Without this change, the failure would be ignored: in local attrs=$(get_device_attrs_from_any_gpu_slice "gpu"), the exit status of the sub shell is replaced by the exit status of local itself.
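The masking described above can be reproduced outside of the test suite. This is a minimal sketch; `failing_helper` is a hypothetical stand-in for a helper that fails, not a function from this repository.

```shell
#!/usr/bin/env bash

# Hypothetical stand-in for a helper that prints something and then fails.
failing_helper() {
  echo "payload"
  return 3
}

masked() {
  # `local` is a command of its own; its exit status (0) replaces the
  # exit status of the command substitution.
  local attrs=$(failing_helper)
  echo "masked status: $?"
}

propagated() {
  local attrs status=0
  # A plain assignment carries the exit status of the command substitution.
  attrs=$(failing_helper) || status=$?
  echo "propagated status: $status"
}

masked      # prints "masked status: 0"
propagated  # prints "propagated status: 3"
```

In a test runner that aborts on the first failing command, only the second form lets the helper's failure stop the test.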
Uh, here we capture both stderr and stdout into output (that's how run works). That breaks things, because that function deliberately emits log output to stderr and payload output to stdout. So, we need this instead:
local attrs
attrs=$(get_device_attrs_from_any_gpu_slice "gpu")
This crashes the test when the sub shell fails, and cleanly directs log output to where it should be.
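The stream separation described above can be sketched in plain Bash. `emit_attrs` is a hypothetical helper that follows the same convention (log lines on stderr, payload on stdout), and `capture_merged` only imitates the merged capture that the comment attributes to `run`; neither is code from this repository.

```shell
#!/usr/bin/env bash

# Hypothetical helper: log output goes to stderr, payload goes to stdout.
emit_attrs() {
  echo "log: querying slices" >&2
  echo '{"type":"gpu"}'
}

# Command substitution captures stdout only; the log line still reaches
# the caller's stderr.
capture_stdout_only() {
  local attrs
  attrs=$(emit_attrs)
  echo "$attrs"
}

# Merging stderr into stdout mixes the log line into the payload.
capture_merged() {
  local attrs
  attrs=$(emit_attrs 2>&1)
  echo "$attrs"
}

capture_stdout_only
capture_merged
```

With the merged capture, a consumer that expects pure payload (for example a JSON parser) would receive the log line as well.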
Oh, something is wrong with the patch. Will look into that soon.
Signed-off-by: Dr. Jan-Philip Gehrcke <jgehrcke@nvidia.com>
Okay -- let's get this in; and on Monday we may already have a strong conclusion. I believe with the newly introduced active/dynamic waiting-for-slices-to-be-created the race condition may be properly alleviated. Let's see.
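For readers unfamiliar with the idea, active waiting usually means polling a condition until it holds or a deadline passes, instead of sleeping for a fixed time. The following is a generic sketch of that pattern, not the implementation from this PR; `wait_until` and its arguments are hypothetical names.

```shell
#!/usr/bin/env bash

# wait_until <timeout_seconds> <interval_seconds> <command...>
# Re-runs the command until it succeeds or the timeout is reached.
wait_until() {
  local timeout="$1" interval="$2"
  shift 2
  local deadline
  deadline=$(( $(date +%s) + timeout ))
  while true; do
    # Success: the condition holds, stop waiting immediately.
    if "$@"; then
      return 0
    fi
    # Failure: give up once the deadline has passed, with a log line on stderr.
    if [ "$(date +%s)" -ge "$deadline" ]; then
      echo "timed out after ${timeout}s waiting for: $*" >&2
      return 1
    fi
    sleep "$interval"
  done
}
```

A caller would pass a check command of its own, for example a hypothetical `slices_exist` function that returns 0 once the expected resource slices are present: `wait_until 60 2 slices_exist`.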
The next time #902 happens, we need to get the resource slice contents.
I've also increased the wait time because seemingly
This can probably be done more robustly (improving the liveness probe would, I think, be the best way to achieve that). If this change results in fewer failures over the weekend, we have gained knowledge. If we still get a failure over the weekend with this change, we also win something because of the added log detail.