Over the years, Non-Intrusive Load Monitoring (NILM) research has focused on improving performance and, more recently, on generalizing across distinct datasets. However, the trustworthiness of the NILM model itself has hardly been addressed. It is therefore important to provide the reasoning behind a NILM model's predicted outcome, especially because machine learning models for NILM are often treated as black boxes. With such explanations, models can not only be improved but can also earn the trust needed for wider adoption across applications. This paper demonstrates how explainability tools can be used to explain the outcomes of a decision tree multi-class classification approach for NILM, and how model explainability informs feature selection and eventually improves performance.
CCS CONCEPTS
• Computing methodologies → Machine learning.
With the massive worldwide smart metering roll-out, both energy suppliers and users are starting to tap into the potential of higher-resolution energy readings for accurate billing, improved demand response, tariffs better tuned to users and the grid, and, via non-intrusive load monitoring (NILM), showing end-users how much their individual appliances contribute to their electricity bills. A number of NILM approaches based on machine learning (ML) have been proposed over the years, focusing on improving NILM model performance. However, the trustworthiness of the NILM model itself has hardly been addressed. Explaining the underlying model and its reasoning is important both to satisfy user curiosity and to understand why a model underperforms so that it can be improved. This can be done by leveraging naturally interpretable or explainable models as well as explainability tools. This paper adopts a naturally interpretable decision tree (DT)-based approach for a NILM multiclass classifier. Furthermore, it leverages explainability tools to determine local and global feature importance, and designs a methodology that informs feature selection for each appliance class. This methodology can indicate how well a trained model will predict an appliance on unseen test data, minimising testing time on target datasets. We explain how one or more appliances can negatively impact the classification of other appliances, and we predict the appliance-level and model-level performance of models trained on REFIT data, both on unseen data from the same house and on unseen houses from the UK-DALE dataset. Experimental results confirm that models trained with the explainability-informed local feature importance can improve toaster classification performance from 65% to 80%.
Additionally, instead of a single five-class classifier covering all five appliances, a three-class classifier for the kettle, microwave, and dishwasher combined with a two-class classifier for the toaster and washing machine improves classification performance for the dishwasher from 72% to 94% and for the washing machine from 56% to 80%.
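To make the idea of global feature importance in a decision tree concrete, the following is a minimal sketch and not the authors' method. It scores each feature by the largest Gini impurity decrease that a single threshold split on that feature can achieve (a depth-one tree, or "stump"), then normalises the scores to sum to one. The abstract does not name the features or the explainability tools used, so the feature names `peak_power_w` and `duration_s`, the helper functions, and the six toy samples are all hypothetical and invented purely for illustration.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def best_split_gain(values, labels):
    """Largest Gini impurity decrease from one threshold split on a feature."""
    parent = gini(labels)
    n = len(labels)
    best = 0.0
    # Candidate thresholds: every distinct value except the maximum,
    # so that both children are non-empty.
    for threshold in sorted(set(values))[:-1]:
        left = [y for v, y in zip(values, labels) if v <= threshold]
        right = [y for v, y in zip(values, labels) if v > threshold]
        child = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
        best = max(best, parent - child)
    return best


def stump_importance(rows, labels, feature_names):
    """Normalised single-split importance per feature (scores sum to 1)."""
    gains = {
        name: best_split_gain([row[i] for row in rows], labels)
        for i, name in enumerate(feature_names)
    }
    total = sum(gains.values())
    return {name: (g / total if total else 0.0) for name, g in gains.items()}


# Hypothetical toy data, invented for illustration only:
# (peak power in watts, event duration in seconds) per appliance activation.
feature_names = ["peak_power_w", "duration_s"]
rows = [
    (2800, 120), (3000, 150),  # kettle
    (900, 130), (1000, 140),   # toaster
    (1200, 125), (1300, 145),  # microwave
]
labels = ["kettle", "kettle", "toaster", "toaster", "microwave", "microwave"]

importance = stump_importance(rows, labels, feature_names)
for name, score in sorted(importance.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {score:.3f}")
```

In this toy set, peak power separates the appliances far better than duration, so it receives the higher score. A full decision tree accumulates such impurity decreases over all of its splits, and a ranking of this kind is one way that feature importance could inform per-appliance feature selection.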