To provide junior researchers with domain-specific concepts efficiently, an automatic approach for academic profiling is needed. First, to obtain personal records of a given scholar, typical supervised approaches often utilise structured data like infobox in Wikipedia as training dataset, but it may lead to a severe mis-labelling problem when they are utilised to train a model directly. To address this problem, a new relation embedding method is proposed for fine-grained entity typing, in which the initial vector of entities and a new penalty scheme are considered, based on the semantic distance of entities and relations. Also, to highlight critical concepts relevant to renowned scholars, scholars’ selective bibliographies which contain massive academic terms are analysed by a newly proposed extraction method based on logistic regression, AdaBoost algorithm and learning-to-rank techniques. It bridges the gap that conventional supervised methods only return binary classification results and fail to help researchers understand the relative importance of selected concepts. Categories of experiments on academic profiling and corresponding benchmark datasets demonstrate that proposed approaches outperform existing methods notably. The proposed techniques provide an automatic way for junior researchers to obtain organised knowledge in a specific domain, including scholars’ background information and domain-specific concepts.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.