Detecting and Mitigating Test-time Failure Risks via Model-agnostic Uncertainty Learning

Lahoti, Preethi; Gummadi, Krishna; Weikum, Gerhard

Released

Research Paper

MPG Authors

/persons/resource/persons225814

Lahoti, Preethi
Databases and Information Systems, MPI for Informatics, Max Planck Society;

/persons/resource/persons45720

Weikum, Gerhard
Databases and Information Systems, MPI for Informatics, Max Planck Society;

External Resources

No external resources are available

Full Texts (open access)

arXiv:2109.04432.pdf
(Preprint), 999KB

Supplementary Material (open access)

No open-access supplementary material is available

Citation

Lahoti, P., Gummadi, K., & Weikum, G. (2021). Detecting and Mitigating Test-time Failure Risks via Model-agnostic Uncertainty Learning. Retrieved from https://arxiv.org/abs/2109.04432.

Citation link: https://hdl.handle.net/21.11116/0000-0009-6491-2

Abstract

Reliably predicting potential failure risks of machine learning (ML) systems
when deployed with production data is a crucial aspect of trustworthy AI. This
paper introduces Risk Advisor, a novel post-hoc meta-learner for estimating
failure risks and predictive uncertainties of any already-trained black-box
classification model. In addition to providing a risk score, the Risk Advisor
decomposes the uncertainty estimates into aleatoric and epistemic uncertainty
components, thus giving informative insights into the sources of uncertainty
inducing the failures. Consequently, Risk Advisor can distinguish between
failures caused by data variability, data shifts and model limitations and
advise on mitigation actions (e.g., collecting more data to counter data
shift). Extensive experiments on various families of black-box classification
models and on real-world and synthetic datasets covering common ML failure
scenarios show that the Risk Advisor reliably predicts deployment-time failure
risks in all the scenarios, and outperforms strong baselines.
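
To make the aleatoric/epistemic split more concrete, the sketch below shows one common entropy-based decomposition over an ensemble's class-probability outputs. This is an illustrative assumption, not necessarily the formulation used by Risk Advisor: the function names `entropy` and `decompose_uncertainty` and the toy inputs are hypothetical. Total uncertainty is taken as the entropy of the averaged prediction, aleatoric uncertainty as the average entropy of the individual members, and epistemic uncertainty as their difference.

```python
import math

def entropy(p):
    """Shannon entropy (in nats) of a discrete probability vector."""
    return -sum(q * math.log(q) for q in p if q > 0.0)

def decompose_uncertainty(member_probs):
    """Split predictive uncertainty for a single input.

    member_probs: one class-probability vector per ensemble member
    (hypothetical meta-learner outputs; not the paper's API).
    Returns (total, aleatoric, epistemic).
    """
    n = len(member_probs)
    k = len(member_probs[0])
    # Average the members' predictions class by class.
    mean = [sum(p[c] for p in member_probs) / n for c in range(k)]
    total = entropy(mean)                                   # entropy of the mean
    aleatoric = sum(entropy(p) for p in member_probs) / n   # mean of entropies
    epistemic = total - aleatoric                           # member disagreement
    return total, aleatoric, epistemic

# Members agree that the outcome is a coin flip: uncertainty is aleatoric.
agree = [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5]]
# Members are individually confident but contradict each other: epistemic.
disagree = [[0.99, 0.01], [0.01, 0.99], [0.99, 0.01]]

for name, probs in (("agree", agree), ("disagree", disagree)):
    total, aleatoric, epistemic = decompose_uncertainty(probs)
    print(f"{name}: total={total:.3f} aleatoric={aleatoric:.3f} "
          f"epistemic={epistemic:.3f}")
```

Under this decomposition, a high epistemic share would point toward the mitigation the abstract mentions (collecting more data), while a high aleatoric share suggests inherent data variability that more data alone is unlikely to remove.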