2019
DOI: 10.3389/fdata.2019.00045

A Review of Microsoft Academic Services for Science of Science Studies

Abstract: Since the relaunch of Microsoft Academic Services (MAS) 4 years ago, scholarly communications have undergone dramatic changes: more ideas are being exchanged online, more authors are sharing their data, and more software tools used to make discoveries and reproduce the results are being distributed openly. The sheer amount of information available is overwhelming for individual humans to keep up and digest. In the meantime, artificial intelligence (AI) technologies have made great strides and the cost of compu…

Cited by 119 publications (104 citation statements)
References: 37 publications

“…The data used for this study consists of all the papers included in the Microsoft Academic Graph (MAG) dataset up to December 31st, 2019 [42, 51]. This dataset includes records of scientific publications specifying the date of the publication, the authors’ names and affiliations, and the publication venue.…”
Section: Methods
Citation type: mentioning
confidence: 99%
“…In stage 2, a neural-network-based method, word2vec (29) with standard settings, was used to quantitatively represent a paper's narrative content by defining the quantitative relationship (co-occurrence) of each word with every other word in the corpus of words in the training set. First, to establish a reliable estimate of word co-occurrences, we used data from the Microsoft Academic Graph (MAG) to train our word2vec model on 2 million scientific article abstracts that were published between 2000 and 2017 (21). This training set has about 200 million tokens (words, letters, or symbols) and 18 million sentences.…”
Section: Results
Citation type: mentioning
confidence: 99%
“…To evaluate the model's accuracy in predicting the true outcome of the manual replication, we calculated each paper's average prediction from its 100 rounds. For the initial training of the word2vec model, we used 2 million abstracts from the MAG database (21) to estimate the relationships among words in scientific papers. This step established a reliable quantification of word co-occurrences in scientific papers based on 200 million tokens (words, letters, or symbols) and 18 million sentences that was then used to digitally represent the word content of papers used in the analysis (n = 413).…”
Section: Results
Citation type: mentioning
confidence: 99%
“…Cluster Relationships. To find the related topics between clusters we try two different approaches, one based on embedding their textual surface form with a language model, and another is based on the scores provided by Microsoft Academic's in-house knowledge graph (MAG) [50,58] of academic entities including topics, authors, venues and papers. We use these similarity scores to discover relationships between topics such as epidemiology and contact tracing.…”
Section: Group Links
Citation type: mentioning
confidence: 99%