Robustness Metrics provides lightweight modules for evaluating the robustness of classification models across three sets of metrics. These include out-of-distribution generalization (e.g. cases where a non-expert human could still classify similar objects despite a changed viewpoint, scene setting, or clutter) and stability of the prediction and the predicted probabilities under natural perturbations of the input.
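To make the out-of-distribution and stability notions above concrete, here is a minimal pure-Python sketch of how such quantities could be computed from predicted probabilities. It is an illustration only and does not use the Robustness Metrics API: the function names (`accuracy`, `prediction_stability`, `probability_shift`) and the toy data are hypothetical, and the mean L1 distance is just one assumed way to measure how much predicted probabilities move.

```python
from typing import Sequence

Probs = Sequence[Sequence[float]]


def argmax(probs: Sequence[float]) -> int:
    """Index of the largest probability (the first index wins on ties)."""
    return max(range(len(probs)), key=lambda i: probs[i])


def accuracy(probs: Probs, labels: Sequence[int]) -> float:
    """Fraction of examples whose top-1 prediction matches the label."""
    correct = sum(argmax(p) == y for p, y in zip(probs, labels))
    return correct / len(labels)


def prediction_stability(clean: Probs, perturbed: Probs) -> float:
    """Fraction of examples whose top-1 prediction is unchanged by the perturbation."""
    same = sum(argmax(c) == argmax(p) for c, p in zip(clean, perturbed))
    return same / len(clean)


def probability_shift(clean: Probs, perturbed: Probs) -> float:
    """Mean L1 distance between clean and perturbed probability vectors."""
    total = sum(
        sum(abs(a - b) for a, b in zip(c, p)) for c, p in zip(clean, perturbed)
    )
    return total / len(clean)


# Hypothetical toy data: model outputs on an in-distribution test set.
labels = [0, 1, 1, 0]
clean_probs = [[0.9, 0.1], [0.2, 0.8], [0.4, 0.6], [0.7, 0.3]]

# Outputs on the same inputs after a natural perturbation (assumed).
perturbed_probs = [[0.8, 0.2], [0.6, 0.4], [0.3, 0.7], [0.6, 0.4]]

# Outputs on a separate, distribution-shifted dataset (assumed).
ood_labels = [1, 0, 0]
ood_probs = [[0.3, 0.7], [0.45, 0.55], [0.8, 0.2]]

# Out-of-distribution generalization: compare accuracy on the two datasets.
clean_acc = accuracy(clean_probs, labels)
ood_acc = accuracy(ood_probs, ood_labels)

# Stability: how often the prediction survives the perturbation, and how far
# the predicted probabilities move on average.
stability = prediction_stability(clean_probs, perturbed_probs)
shift = probability_shift(clean_probs, perturbed_probs)

print(f"in-distribution accuracy: {clean_acc:.2f}")
print(f"out-of-distribution accuracy: {ood_acc:.2f}")
print(f"prediction stability: {stability:.2f}")
print(f"mean L1 probability shift: {shift:.2f}")
```

The gap between the in-distribution and out-of-distribution accuracy is one simple way to summarize generalization under shift, while the two stability numbers separate changes in the predicted label from changes in the predicted probabilities.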