A significant rise in Artificial Intelligence (AI) has impacted many applications around us, so much so that AI is now increasingly used in safety-critical applications. AI at the edge is a reality, which means performing data computation closer to the source of the data, as opposed to performing it in the cloud. Safety-critical applications have strict reliability requirements; therefore, it is essential that AI models running on the edge (i.e., on hardware) fulfill the required safety standards. Within the vast field of AI, Deep Neural Networks (DNNs) are the focal point of this survey, as they have continued to produce extraordinary outcomes in various applications, e.g., medical, automotive, aerospace, and defense. Traditional reliability techniques for DNN implementations are not always practical, as they fail to exploit the unique characteristics of DNNs. Furthermore, it is also essential to understand the targeted edge hardware, because the impact of faults can differ between ASICs and FPGAs. Therefore, in this survey, we first examine the impact of faults in ASICs and FPGAs, and then provide a glimpse of the recent progress made towards fault-tolerant DNNs. We discuss several factors that can impact the reliability of DNNs. Further, we extend this discussion to shed light on many state-of-the-art fault mitigation techniques for DNNs.