Named Entity Recognition (NER) is an integral task in many NLP applications such as machine translation, information extraction, and question answering. Arabic is one of the official languages of the United Nations. A large amount of Arabic text is now available on the internet, so the need for tools that process this information has become significant. In this study, we examine the impact of the conditional random field (CRF) and the structured support vector machine on the task of Arabic NER. This is the first time the structured support vector machine has been applied to Arabic named entity recognition. Our proposed system has three stages: preprocessing, feature extraction, and model building. We use simple features, namely the bag of words in the [-1,1] window and the bag of part-of-speech tags in the [-1,1] window, to enable our system to detect multi-word entities. We also tried to enhance the output tags of the Stanford part-of-speech tagger, which enabled our system to differentiate named entities from non-entities. In addition, we employ the following binary features: is a person, is a prename, is a pre-location, is a location, and is an organization. Our system was trained and tested on part of ANERcorp. The results show that the CRF-based Arabic NER system outperforms the structured support vector machine-based Arabic NER system when both use the same feature set.
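The feature set described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the gazetteer contents, the transliterated example tokens, the POS tags, and the names `GAZETTEERS`, `token_features`, and `sentence_features` are all hypothetical, and the "bag" in the [-1,1] window is interpreted here as an unordered set of indicator features.

```python
from typing import Dict, List, Set

# Hypothetical toy gazetteers (assumption): the abstract names the five
# binary features but does not give the underlying word lists.
GAZETTEERS: Dict[str, Set[str]] = {
    "is_person": {"Ahmed"},
    "is_prename": {"al-sayyid"},       # e.g. a title that precedes a person name
    "is_prelocation": {"madinat"},     # e.g. a word that precedes a place name
    "is_location": {"Dubai"},
    "is_organization": {"al-Jazeera"},
}


def token_features(tokens: List[str], pos_tags: List[str], i: int) -> Dict[str, bool]:
    """Build indicator features for token i from a [-1, 1] window."""
    feats: Dict[str, bool] = {}
    for offset in (-1, 0, 1):
        j = i + offset
        if 0 <= j < len(tokens):
            # Bag semantics: record presence in the window, not position.
            feats[f"bow:{tokens[j]}"] = True
            feats[f"bop:{pos_tags[j]}"] = True
    # Binary gazetteer lookups for the current token only.
    for name, entries in GAZETTEERS.items():
        feats[name] = tokens[i] in entries
    return feats


def sentence_features(tokens: List[str], pos_tags: List[str]) -> List[Dict[str, bool]]:
    """Return one feature dict per token of a POS-tagged sentence."""
    if len(tokens) != len(pos_tags):
        raise ValueError("tokens and pos_tags must have the same length")
    return [token_features(tokens, pos_tags, i) for i in range(len(tokens))]


# Illustrative transliterated sentence and tags (made up for this sketch).
TOKENS = ["al-sayyid", "Ahmed", "zara", "madinat", "Dubai"]
POS_TAGS = ["NN", "NNP", "VBD", "NN", "NNP"]
features = sentence_features(TOKENS, POS_TAGS)
```

Because each token sees its immediate neighbours, a trigger word such as a title or a place-name prefix contributes evidence to the token that follows it, which is how a window of this size can help with multi-word entities. The same per-token feature dicts could be passed to either a CRF or a structured SVM sequence labeller, which matches the abstract's comparison of the two models on one feature set.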