2022
DOI: 10.1609/aaai.v36i10.21364

Transformer Uncertainty Estimation with Hierarchical Stochastic Attention

Abstract: Transformers are state-of-the-art in a wide range of NLP tasks and have also been applied to many real-world products. Understanding the reliability and certainty of transformer models is crucial for building trustable machine learning applications, e.g., medical diagnosis. Although many recent transformer extensions have been proposed, the study of the uncertainty estimation of transformer models is under-explored. In this work, we propose a novel way to enable transformers to have the capability of uncertain…

Cited by 3 publications (2 citation statements) · References 17 publications
“…A model is perfectly calibrated if, for some data distribution D and for all input pairs (x, y) ∈ D, whenever the model predicts p_i = 0.8, 80% of such pairs have y_i as the ground-truth label. Work on calibration can be broadly categorized into post-hoc methods that calibrate models after training ([12], [13], [14]), regularization methods applied during training ([15], [16], [17], [18]), data augmentation methods ([19], [20]), and methods that alleviate miscalibration by injecting randomness with uncertainty estimation ([21], [22], [23]). However, even a perfectly-calibrated model can make aberrant predictions.…”
Section: Related Work
confidence: 99%
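To make the calibration criterion quoted above concrete, here is a minimal sketch (not taken from the indexed paper or its citing papers) that bins a model's confidences and compares each bin's average confidence with its empirical accuracy, i.e., the expected calibration error (ECE). The function name, bin count, and synthetic data below are illustrative assumptions.

# Minimal ECE sketch: a model is well calibrated if, in every confidence bin,
# the average predicted confidence matches the fraction of correct predictions.
import numpy as np

def expected_calibration_error(confidences, correct, num_bins=10):
    """confidences: predicted max-class probabilities, shape (N,).
    correct: 1 if the prediction matched the ground-truth label, else 0."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, num_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if not mask.any():
            continue
        avg_conf = confidences[mask].mean()   # e.g. ~0.8 for the (0.7, 0.8] bin
        accuracy = correct[mask].mean()       # fraction with y_i as ground truth
        ece += mask.mean() * abs(avg_conf - accuracy)
    return ece

# Toy usage: labels agree with the prediction at a rate equal to the confidence,
# so the model is (nearly) perfectly calibrated and ECE is close to 0.
rng = np.random.default_rng(0)
conf = rng.uniform(0.5, 1.0, size=10_000)
correct = rng.uniform(size=conf.shape) < conf
print(f"ECE ~ {expected_calibration_error(conf, correct):.3f}")

A perfectly calibrated model in the sense of the quote (an 80%-confidence bin that is right 80% of the time) drives this quantity toward zero; as the quote warns, a low ECE alone does not rule out aberrant predictions.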
“…This line of work aims to alleviate model miscalibration by injecting randomness. The popular methods are (1) Bayesian neural networks (Blundell et al. 2015; Fortunato, Blundell, and Vinyals 2017), (2) ensembles (Lakshminarayanan, Pritzel, and Blundell 2017), (3) Monte Carlo (MC) dropout (Gal and Ghahramani 2016), and (4) Gumbel-softmax (Jang, Gu, and Poole 2017) based approaches (Wang, Lawrence, and Niepert 2021; Pei, Wang, and Szarvas 2022). The former three sub-categories have been discussed in recent surveys (Mena, Pujol, and Vitria 2021; Gawlikowski et al. 2021).…”
Section: Uncertainty Estimation
confidence: 99%
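As a companion to the methods enumerated in the statement above, the following is a minimal sketch of one of them, Monte Carlo (MC) dropout (Gal and Ghahramani 2016), rather than the hierarchical stochastic attention proposed in the indexed paper. The toy classifier, dropout rate, and number of stochastic forward passes are illustrative assumptions; the same pattern applies to a transformer by keeping its dropout layers active at inference time.

# Minimal MC-dropout sketch: repeat stochastic forward passes with dropout
# enabled and use the spread of the predictions as an uncertainty signal.
import torch
import torch.nn as nn

class SmallClassifier(nn.Module):
    def __init__(self, dim=32, num_classes=3, p_drop=0.1):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, 64), nn.ReLU(), nn.Dropout(p_drop),
            nn.Linear(64, num_classes),
        )

    def forward(self, x):
        return self.net(x)

@torch.no_grad()
def mc_dropout_predict(model, x, num_samples=20):
    """Average softmax outputs over several dropout-perturbed passes."""
    model.train()  # keeps dropout active; no weights are updated here
    probs = torch.stack(
        [torch.softmax(model(x), dim=-1) for _ in range(num_samples)]
    )
    mean_probs = probs.mean(dim=0)   # predictive distribution
    std_probs = probs.std(dim=0)     # per-class disagreement across passes
    return mean_probs, std_probs

model = SmallClassifier()
x = torch.randn(4, 32)               # a batch of 4 feature vectors
mean_probs, std_probs = mc_dropout_predict(model, x)
print(mean_probs.argmax(dim=-1), std_probs.max(dim=-1).values)

The per-class standard deviation across passes acts as a simple uncertainty estimate; the ensembles and Gumbel-softmax based approaches from the same list replace the repeated dropout passes with multiple trained models or with sampled discrete attention, respectively.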