Privacy concerns are growing across sectors. Regulations such as the EU's General Data Protection Regulation (GDPR) are pressuring organizations to handle individuals' data with greater caution. As information systems deal with increasingly large amounts of personal data in essential services, there is a lack of mechanisms to help organizations protect the data subjects involved. In this paper, we propose and evaluate the use of Named Entity Recognition as a way to identify, monitor and validate Personally Identifiable Information. In our experiments, we used three of the most well-known Natural Language Processing tools (NLTK, Stanford CoreNLP, and spaCy). First, we assess the effectiveness of the tools with a generic dataset. Then, machine learning models are trained and evaluated with datasets built on data that contain personally identifiable information. The results show that the models performed well, accurately classifying both generic and more context-specific data. We observe the relationship between training set size and model performance and estimate the appropriate size for model training within this context. Furthermore, we discuss how our proposal can effectively act as a Privacy Enhancing Technology, as well as the potential risks and associated impacts.
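To make the idea of using recognized entities to protect personal data concrete, the following is a minimal sketch, not the authors' implementation. It assumes an NER tool has already produced character-offset spans with labels; the `Entity` type, the `redact` helper, the `PII_LABELS` set and the sample sentence are all hypothetical names and data introduced for illustration.

```python
from typing import List, NamedTuple, Set


class Entity(NamedTuple):
    start: int   # character offset where the entity begins
    end: int     # character offset just past the entity
    label: str   # entity type reported by the NER tool, e.g. "PERSON"


# Hypothetical choice of labels treated as PII; adapt to the tagger's label scheme.
PII_LABELS: Set[str] = {"PERSON", "GPE", "ORG"}


def redact(text: str, entities: List[Entity], pii_labels: Set[str] = PII_LABELS) -> str:
    """Replace every PII entity span in `text` with a [LABEL] placeholder."""
    out = []
    cursor = 0
    for ent in sorted(entities, key=lambda e: e.start):
        if ent.label not in pii_labels or ent.start < cursor:
            continue  # skip non-PII labels and spans overlapping an earlier redaction
        out.append(text[cursor:ent.start])
        out.append(f"[{ent.label}]")
        cursor = ent.end
    out.append(text[cursor:])
    return "".join(out)


# Illustrative input: the spans stand in for the output of any NER tool.
text = "Maria Silva signed the lease in Lisbon on 3 May."
entities = [Entity(0, 11, "PERSON"), Entity(32, 38, "GPE"), Entity(42, 47, "DATE")]
redacted = redact(text, entities)
print(redacted)
```

Keeping the redaction step independent of the tagger means the same helper could sit behind NLTK, Stanford CoreNLP or spaCy, as long as their output is first converted to offset spans.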
As information systems deal with contracts and documents in essential services, there is a lack of mechanisms to help organizations protect the data subjects involved. In this paper, we evaluate the use of named entity recognition as a way to identify, monitor and validate personally identifiable information. In our experiments, we use three of the most well-known Natural Language Processing tools (NLTK, Stanford CoreNLP, and spaCy). First, the effectiveness of the tools is evaluated on a generic dataset. Then, the tools are applied to datasets built from contracts that contain personally identifiable information. The results show that the models performed well, accurately classifying both the generic data and the contract data. Furthermore, we discuss how our proposal can effectively act as a Privacy Enhancing Technology.
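The abstract does not say which metric was used to evaluate the tools. As an assumption, the sketch below uses entity-level precision, recall and F1 with exact span-and-label matching, a common choice for comparing NER output against annotated data. The `prf1` helper and the sample spans are hypothetical.

```python
from typing import Set, Tuple

Span = Tuple[int, int, str]  # (start offset, end offset, label)


def prf1(gold: Set[Span], predicted: Set[Span]) -> Tuple[float, float, float]:
    """Entity-level precision, recall and F1 using exact span-and-label matches."""
    true_pos = len(gold & predicted)
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Illustrative annotations: the second prediction has the right span but the wrong label.
gold = {(0, 11, "PERSON"), (32, 38, "GPE")}
predicted = {(0, 11, "PERSON"), (32, 38, "ORG")}
precision, recall, f1 = prf1(gold, predicted)
print(precision, recall, f1)
```

Exact matching is strict: a prediction with the correct boundaries but the wrong label counts as both a false positive and a false negative, which matters when the label decides whether a span is treated as personally identifiable.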
Index modulation (IM) has attracted considerable research effort in recent years, as it is considered a promising technology that can enhance spectral and energy efficiency and help cope with the rising demand for mobile traffic in future wireless networks. In this paper, we propose a cloud radio access network (C-RAN) suitable for fifth-generation (5G) and beyond systems, where the base stations (BSs) and access points (APs) transmit multidimensional IM symbols, which we refer to as precoding-aided transmitter-side generalized space–frequency IM (PT-GSFIM). The adopted PT-GSFIM approach is an alternative multiuser multiple-input multiple-output (MU-MIMO) scheme that avoids multiuser interference (MUI) while exploiting the inherent diversity in frequency-selective channels. To validate the potential gains of the proposed PT-GSFIM-based C-RAN, a thorough system-level assessment is presented for three different three-dimensional scenarios taken from standardized 5G New Radio (5G NR), using two different numerologies and frequency ranges. Throughput results indicate that the 28 GHz band, in spite of its higher bandwidth and higher achieved throughput, presents lower spectral efficiency (SE). The 3.5 GHz band, with lower bandwidth and lower achieved throughput, attains higher SE. Overall, the results indicate that a C-RAN based on the proposed PT-GSFIM scheme clearly outperforms both generalized spatial modulation (GSM) and conventional MU-MIMO by exploiting its additional inherent frequency diversity.
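The throughput and SE observations are compatible once SE is read as throughput normalized by bandwidth. As a sketch, using the assumed symbols $R$ for achieved throughput and $B$ for bandwidth (neither is defined in the abstract), with subscripts denoting the band:

```latex
\mathrm{SE} = \frac{R}{B}, \qquad
\mathrm{SE}_{28} < \mathrm{SE}_{3.5}
\;\Longleftrightarrow\;
\frac{R_{28}}{R_{3.5}} < \frac{B_{28}}{B_{3.5}}
```

In words, the 28 GHz band shows lower SE whenever its throughput gain over the 3.5 GHz band is smaller than its bandwidth gain.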