Privacy is critical in the financial domain, where data is highly confidential and sensitive. Natural Language Processing (NLP) techniques can be applied to text classification and entity detection tasks in finance, such as sentiment analysis of customer feedback, entity detection in invoices, and categorisation of financial documents by type. Because this data is sensitive, privacy measures are needed both when handling it and when training large models on it. In this work, we propose a contextualized transformer-based (BERT and RoBERTa) text classification model integrated with privacy features such as Differential Privacy (DP) and Federated Learning (FL). We present how to privately train NLP models with desirable privacy-utility tradeoffs and evaluate them on the Financial Phrase Bank dataset.
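The abstract above does not say which DP mechanism is used. As an illustration only, the following minimal sketch assumes the common DP-SGD recipe: clip each per-example gradient to a maximum L2 norm, sum the clipped gradients, add Gaussian noise scaled to that norm, and average. The function names and the `max_norm` and `noise_multiplier` parameters are hypothetical, and the sketch does not do the privacy accounting needed to report an actual epsilon.

```python
import math
import random


def clip(grad, max_norm):
    """Scale a gradient vector down so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def dp_average_gradient(per_example_grads, max_norm, noise_multiplier, rng):
    """Return a noisy average of per-example gradients (assumed DP-SGD style).

    per_example_grads: list of equal-length gradient vectors, one per example.
    max_norm:          clipping bound, which caps any single example's influence.
    noise_multiplier:  Gaussian noise std as a multiple of max_norm.
    rng:               a random.Random instance, passed in for reproducibility.
    """
    dim = len(per_example_grads[0])
    summed = [0.0] * dim
    for grad in per_example_grads:
        for i, g in enumerate(clip(grad, max_norm)):
            summed[i] += g
    sigma = noise_multiplier * max_norm
    noisy = [s + rng.gauss(0.0, sigma) for s in summed]
    return [v / len(per_example_grads) for v in noisy]


if __name__ == "__main__":
    grads = [[3.0, 4.0], [0.3, 0.4]]  # toy per-example gradients
    print(dp_average_gradient(grads, max_norm=1.0, noise_multiplier=1.1,
                              rng=random.Random(0)))
```

A larger `noise_multiplier` gives stronger privacy at the cost of noisier updates, which is one way the privacy-utility tradeoff mentioned above shows up in practice.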
Federated Learning (FL) enables edge devices to collaboratively train a joint model without sharing their local data. This decentralised and distributed approach improves user privacy, security, and trust. Different variants of FL algorithms have shown promising results on both IID and skewed non-IID data. However, the performance of FL algorithms is sensitive to the FL system parameters and to the hyperparameters of the model used. In practice, finding the right parameter settings for an FL algorithm is an expensive task. In this preregistered paper, we propose an empirical investigation of five prominent FL algorithms to discover the relation between the FL System Parameters (FLSPs) and their performance. The FLSPs add extra complexity to FL algorithms compared with a traditional ML system. We hypothesise that choosing the best FL algorithm for a given set of FLSPs is not a trivial problem. Further, we endeavour to formulate a single easy-to-use metric that describes the performance of an FL algorithm, thereby making comparison simpler.
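The abstract above does not name the five algorithms it studies. To make the basic FL loop concrete, here is a minimal sketch of federated averaging (FedAvg), used purely as a representative example. It assumes a toy one-dimensional linear model y = w*x + b; the function names, learning rate, and round count are hypothetical. Each client trains on its own data, and only model weights are sent back for aggregation.

```python
def local_update(weights, data, lr=0.05, epochs=1):
    """One client's local training: gradient descent on mean squared error.

    weights: [w, b] received from the server.
    data:    list of (x, y) pairs that never leave the client.
    """
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        grad_w = sum(2 * ((w * x + b) - y) * x for x, y in data) / n
        grad_b = sum(2 * ((w * x + b) - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return [w, b]


def fed_avg(client_weights, client_sizes):
    """Server step: average client weights, weighted by local dataset size."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


def run_rounds(global_weights, client_datasets, rounds=500, lr=0.05):
    """Repeat: broadcast weights, train locally, aggregate on the server."""
    sizes = [len(d) for d in client_datasets]
    for _ in range(rounds):
        updates = [local_update(global_weights, d, lr) for d in client_datasets]
        global_weights = fed_avg(updates, sizes)
    return global_weights


if __name__ == "__main__":
    # Two clients with different (non-IID) slices of the line y = 2x + 1.
    clients = [[(0.0, 1.0), (1.0, 3.0)],
               [(2.0, 5.0), (3.0, 7.0), (4.0, 9.0)]]
    print(run_rounds([0.0, 0.0], clients))
```

Even in this toy setting, the number of rounds, the number of local epochs, the number of clients, and the way data is split across them all affect convergence, which is the kind of system-parameter sensitivity the abstract refers to.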