Simple: make a plot showing the 100 worst-classified datapoints in the validation set. This can be done inside `Validation scoring`, or separately by reusing the code from it.
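One possible starting point is sketched below. It is only a sketch under assumptions, not the code from `Validation scoring`: it assumes the validation scoring yields a true label and a list of per-class probabilities for each datapoint, and it takes "worst classified" to mean the lowest predicted probability for the true class. The names `worst_classified`, `plot_worst`, `y_true`, `y_prob` and the output file `worst_100.png` are all hypothetical.

```python
def worst_classified(y_true, y_prob, k=100):
    """Return (indices, true-class probabilities) of the k worst samples.

    ASSUMPTION: "worst" means the lowest predicted probability for the
    true class. y_true holds integer labels and y_prob holds one list of
    per-class probabilities per sample.
    """
    # Probability the model assigned to the correct class of each sample.
    p_true = [probs[label] for label, probs in zip(y_true, y_prob)]
    # Sample indices ordered from least to most confident in the true class.
    order = sorted(range(len(p_true)), key=lambda i: p_true[i])
    worst = order[:k]
    return worst, [p_true[i] for i in worst]


def plot_worst(indices, p_true, path="worst_100.png"):
    """Save a bar plot of the true-class probability for the given samples."""
    # Imported here so the selection step works without matplotlib installed.
    import matplotlib
    matplotlib.use("Agg")  # non-interactive backend, writes to file only
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots(figsize=(12, 4))
    ax.bar(range(len(indices)), p_true)
    ax.set_xlabel("rank (0 = worst)")
    ax.set_ylabel("predicted probability of true class")
    ax.set_title(f"{len(indices)} worst-classified validation datapoints")
    fig.tight_layout()
    fig.savefig(path)
    plt.close(fig)
    return path


# Toy data standing in for the real validation output (hypothetical values).
demo_true = [0, 1, 1, 0]
demo_prob = [[0.9, 0.1], [0.8, 0.2], [0.3, 0.7], [0.4, 0.6]]
demo_idx, demo_p = worst_classified(demo_true, demo_prob, k=2)
print(demo_idx)  # [1, 3]
```

With the real validation output, calling `worst_classified(y_true, y_prob)` and passing the result to `plot_worst` would produce the plot. The returned indices can also be used to look up the underlying datapoints if the plot should show the samples themselves rather than their scores.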