Information extraction (IE) in natural language processing (NLP) aims to extract structured information from unstructured text so that computers can better interpret natural language. Machine learning-based IE methods are more capable and flexible, but they require a large, accurately labeled corpus. In materials science, producing reliable labels is laborious and requires the effort of many domain experts. To reduce manual intervention, we propose a semi-supervised IE framework for materials that is trained on an automatically generated corpus. Taking the superalloy data extraction from our previous work as an example, the proposed framework uses Snorkel to automatically label the corpus containing property values. An Ordered Neurons Long Short-Term Memory (ON-LSTM) network is then trained on the generated corpus to obtain an information extraction model. The experimental results show F1-scores of 83.90%, 94.02%, and 89.27% for the γ′ solvus temperature, density, and solidus temperature of superalloys, respectively. Similar experiments on other materials indicate that the proposed framework is also applicable elsewhere in the materials field.
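The automatic labeling step can be illustrated with a minimal sketch of weak supervision. This is not the authors' pipeline and does not call Snorkel: the three labeling functions, the regular expressions, and the majority-vote combiner below are all assumptions made for illustration (Snorkel itself can combine labeling-function votes with a learned label model rather than a simple majority vote).

```python
import re
from collections import Counter

# Label conventions (assumed for this sketch).
ABSTAIN, NEGATIVE, POSITIVE = -1, 0, 1


def lf_mentions_solvus(sentence: str) -> int:
    """Vote POSITIVE if the property keyword appears, otherwise abstain."""
    return POSITIVE if re.search(r"solvus", sentence, re.IGNORECASE) else ABSTAIN


def lf_has_temperature_value(sentence: str) -> int:
    """Vote POSITIVE if a number followed by a temperature unit appears."""
    pattern = r"\d+(?:\.\d+)?\s*(?:°C|K)\b"
    return POSITIVE if re.search(pattern, sentence) else NEGATIVE


def lf_no_digits(sentence: str) -> int:
    """Vote NEGATIVE if the sentence contains no digits at all."""
    return NEGATIVE if not re.search(r"\d", sentence) else ABSTAIN


# Hypothetical set of labeling functions for one property.
LABELING_FUNCTIONS = [lf_mentions_solvus, lf_has_temperature_value, lf_no_digits]


def majority_vote(votes: list[int]) -> int:
    """Combine votes, ignoring abstentions; ties and empty votes abstain."""
    counted = Counter(v for v in votes if v != ABSTAIN)
    if not counted:
        return ABSTAIN
    ranked = counted.most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return ABSTAIN
    return ranked[0][0]


def weak_label(sentence: str) -> int:
    """Apply every labeling function and return the combined weak label."""
    return majority_vote([lf(sentence) for lf in LABELING_FUNCTIONS])


if __name__ == "__main__":
    examples = [
        "The γ' solvus temperature of the alloy is 1150 °C.",
        "Superalloys are widely used in turbine blades.",
    ]
    for text in examples:
        print(weak_label(text), text)
```

Sentences that receive a non-abstain label would form the generated corpus on which a sequence model such as ON-LSTM is then trained; sentences on which the functions disagree evenly are left unlabeled in this sketch.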
Wi-Fi fingerprint-based indoor positioning has attracted considerable attention, but it still faces challenges in positioning time cost and positioning accuracy. Selecting suitable wireless access points (APs) is essential, because the more APs are used for positioning, the higher the online computation, energy, and time costs. Furthermore, the received signal strength (RSS) is easily affected by interference such as obstacles and multipath effects, which reduces positioning accuracy. This paper proposes an AP selection algorithm and a positioning algorithm to address these issues. The AP selection algorithm fuses the RSS distribution and the interval overlap degree to select a small number of highly important APs for positioning. The positioning algorithm uses the location distance between reference points (RPs) to construct circles and uses the extreme (maximum and minimum) values of each circle to estimate how likely the test point (TP) is to lie within it; it then identifies useful APs to determine the weights of the RPs. Extensive experiments conducted in two different areas show the effectiveness of the proposed algorithms.
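To make the idea of an interval overlap degree concrete, here is a minimal sketch of one possible interpretation. The abstract does not define the measure, so everything below is an assumption: each AP's RSS samples at an RP are summarized by a [min, max] interval, the overlap of two intervals is their shared length divided by their combined span, and APs whose intervals overlap least across RPs are treated as most discriminative. The RSS-distribution part of the fusion is not modeled, and the function names and data layout are hypothetical.

```python
from itertools import combinations

Interval = tuple[float, float]


def interval_overlap(a: Interval, b: Interval) -> float:
    """Shared length of two closed intervals divided by their combined span."""
    shared = min(a[1], b[1]) - max(a[0], b[0])
    span = max(a[1], b[1]) - min(a[0], b[0])
    if span == 0:
        return 1.0  # both intervals are the same single point
    return max(0.0, shared) / span


def mean_overlap(samples_per_rp: list[list[float]]) -> float:
    """Average pairwise overlap of one AP's RSS intervals across all RPs."""
    intervals = [(min(s), max(s)) for s in samples_per_rp]
    pairs = list(combinations(intervals, 2))
    return sum(interval_overlap(a, b) for a, b in pairs) / len(pairs)


def select_aps(fingerprints: dict[str, list[list[float]]], k: int) -> list[str]:
    """Return the k APs with the lowest mean overlap (assumed most useful)."""
    scores = {ap: mean_overlap(rps) for ap, rps in fingerprints.items()}
    return sorted(scores, key=scores.get)[:k]


if __name__ == "__main__":
    # Hypothetical RSS samples in dBm: one inner list per reference point.
    fingerprints = {
        "ap_a": [[-40, -42, -41], [-60, -62, -61], [-80, -79, -81]],
        "ap_b": [[-50, -60], [-52, -58], [-51, -59]],
        "ap_c": [[-40, -50], [-45, -55], [-70, -75]],
    }
    print(select_aps(fingerprints, 2))
```

An AP such as `ap_b`, whose RSS ranges look nearly identical at every RP, carries little location information under this interpretation, so it is dropped first. The sketch needs at least two RPs per AP.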