There are some problems with totalizations in the Bokeh GUI whenever there's more than one worker per host. They are particularly glaring on LocalClusters.
- In the `Workers` tab:
  - The cluster total for the columns `net read`, `net write`, `disk read` and `disk write` sums up the value for each worker. However, these are host-wide measures, so if two or more workers sit on the same host, the total will be double-counted.
  - The cluster total for the columns `gpu_memory_used` and `gpu_utilization` sums up the value for each worker. However, two workers may share the same GPU, depending on the `CUDA_VISIBLE_DEVICES` environment variable (see `nvml.py`). If the variable is not set, and in general on single-GPU hosts, all workers on the same host will share the same GPU. Again, this leads to double-counting.
  - The cluster total of the column `event_loop_interval` is a sum of the workers. This makes no sense; it should be a mean.
- The `More... -> Workers Disk` and `More... -> Workers Network` tabs show one bar per worker. This is misleading; there should be one bar per host.
- The `More... -> GPU Memory` and `More... -> GPU Utilization` tabs show one bar per worker. This is misleading; there should be one bar per GPU.
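
To illustrate the aggregation proposed above, here is a minimal sketch in plain Python. It is not the dashboard's actual code: the `workers` records, their field names (`host`, `gpu`, `net_read`, ...) and the helper `deduplicated_total` are all hypothetical. It assumes that workers sharing a host (or a GPU) report roughly the same value for a shared resource, so one representative value per group is summed instead of one value per worker.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical per-worker samples; field names are illustrative only and
# do not reflect the real metrics schema used by the dashboard.
workers = [
    {"host": "10.0.0.1", "gpu": 0, "net_read": 100.0,
     "gpu_memory_used": 4.0, "event_loop_interval": 0.02},
    {"host": "10.0.0.1", "gpu": 0, "net_read": 100.0,
     "gpu_memory_used": 4.0, "event_loop_interval": 0.04},
    {"host": "10.0.0.2", "gpu": 0, "net_read": 50.0,
     "gpu_memory_used": 2.0, "event_loop_interval": 0.03},
]


def deduplicated_total(workers, metric, group_fields):
    """Sum `metric` once per group (e.g. per host or per GPU).

    Workers in the same group observe the same shared resource, so their
    readings are averaged into one value before the groups are summed.
    """
    groups = defaultdict(list)
    for w in workers:
        key = tuple(w[f] for f in group_fields)
        groups[key].append(w[metric])
    return sum(mean(values) for values in groups.values())


# Naive per-worker sum counts host 10.0.0.1 twice.
naive_net_read = sum(w["net_read"] for w in workers)

# Host-wide measures: one value per host.
host_net_read = deduplicated_total(workers, "net_read", ["host"])

# GPU measures: one value per (host, GPU) pair.
gpu_memory = deduplicated_total(workers, "gpu_memory_used", ["host", "gpu"])

# Per-worker measure where a sum is meaningless: take the mean instead.
mean_interval = mean(w["event_loop_interval"] for w in workers)
```

The same grouping keys would also give one bar per host in the disk and network plots, and one bar per GPU in the GPU plots. How a worker is actually mapped to a GPU is left out here; the `gpu` field simply stands in for whatever `CUDA_VISIBLE_DEVICES` resolves to.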