Abstract. Author name ambiguity is one of the problems that decrease the quality and reliability of information retrieved from digital libraries. Existing methods have tried to solve this problem by predefining a feature set based on expert's knowledge for a specific dataset. In this paper, we propose a new approach which uses deep neural network to learn features automatically for solving author name ambiguity. Additionally, we propose the general system architecture for author name disambiguation on any dataset. We evaluate the proposed method on a dataset containing Vietnamese author names. The results show that this method significantly outperforms other methods that use predefined feature set. The proposed method achieves 99.31% in terms of accuracy. Prediction error rate decreases from 1.83% to 0.69%, i.e., it decreases by 1.14%, or 62.3% relatively compared with other methods that use predefined feature set (Table 3).
Successful research collaborations may facilitate major outcomes in science and their applications. Thus, identifying effective collaborators may be a key factor that affects success. However, it is very difficult to identify potential collaborators and it is particularly difficult for young researchers who have less knowledge about other researchers and experts in their research domain. This study introduces and defines the problem of collaborator recommendation for 'isolated' researchers who have no links with others in coauthor networks. Existing approaches such as link-based and contentbased methods may not be suitable for isolated researchers because of their lack of links and content information. Thus, we propose a new approach that uses additional information as new features to make recommendations, i.e., the strength of the relationship between organizations, the importance rating, and the activity scores of researchers. We also propose a new method for evaluating the quality of collaborator recommendations. We performed experiments by crawling publications from the Microsoft Academic Search website. The metadata were extracted from these publications, including the year, authors, organizational affiliations of authors, citations, and references. The metadata from publications between 2001 and 2005 were used as the training data while those from 2006 to 2011 were used for validation. The experimental results demonstrated the effectiveness and efficiency of our proposed approach.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.