This work aims to identify classes of DOI mistakes by analysing the open bibliographic metadata available in Crossref, highlighting which publishers were responsible for such mistakes and how many of these incorrect DOIs could be corrected through automatic processes. Using a list of invalid cited DOIs gathered by OpenCitations while processing the OpenCitations Index of Crossref open DOI-to-DOI citations (COCI) in the past two years, we retrieved the citations to such invalid DOIs in the January 2021 Crossref dump. We processed these citations by keeping track of their validity and of the publishers responsible for uploading the related citation data to Crossref. Finally, we identified patterns of factual errors in the invalid DOIs and the regular expressions needed to catch and correct them. The outcomes of this research show that only a few publishers were responsible for and/or affected by the majority of invalid citations. We extended the taxonomy of DOI name errors proposed in past studies and defined more elaborate regular expressions that can clean a higher number of mistakes in invalid DOIs than prior approaches. The data gathered in our study enable investigating possible reasons for DOI mistakes from a qualitative point of view, helping publishers identify the problems underlying their production of invalid citation data. In addition, the DOI cleaning mechanism we present could be integrated into existing ingestion processes (e.g. in COCI) so that citations are added after their incorrect DOIs have been automatically corrected. This study was run strictly following Open Science principles and, as such, our research outcomes are fully reproducible.
The purpose of this protocol is to provide an automated process to repair invalid DOIs collected by the OpenCitations Index of Crossref open DOI-to-DOI citations (COCI) while processing data provided by Crossref. The data needed for this work are provided by Silvio Peroni as a CSV containing pairs of valid citing DOIs and invalid cited DOIs. To design an automated process, we first classified the errors that characterise the wrong DOIs in the list. The starting hypothesis is that there are two main classes of errors: factual errors, such as wrong characters, and DOIs that are not yet valid at the time of processing. The first class can be further divided into three subclasses: errors due to irrelevant strings added at the beginning (prefix-type errors) or at the end (suffix-type errors) of the correct DOI, and errors due to unwanted characters in the middle (other-type errors). Once the classes of errors are identified, we propose automatic processes to obtain correct DOIs from wrong ones. These processes involve the use of the information returned by the DOI API and the January 2021 Public Data File from Crossref, as well as rule-based methods, including regular expressions, to correct invalid DOIs. The application of this methodology produced a CSV dataset containing all the pairs of citing and cited DOIs in the original dataset, each one enriched with five fields: "Already_Valid", which states whether the cited DOI was already valid before cleaning; "New_DOI", which contains a clean, valid DOI (if our procedure was able to produce one); and "prefix_error", "suffix_error" and "other-type_error", which contain, for each cleaned DOI, the number of errors of that type that were corrected.
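The prefix- and suffix-type cleaning described above can be sketched with a short rule-based routine. This is a minimal illustration, not the exact regular expressions used in the protocol: the patterns for prefix junk (e.g. a leading "doi:" or "https://doi.org/") and suffix junk (trailing punctuation carried over from the citation text) are assumptions for the example, and `clean_doi` is a hypothetical helper name.

```python
import re

# Extract the DOI core: a "10." registrant prefix followed by a suffix.
# This pattern is illustrative, not the protocol's exact rule.
DOI_CORE = re.compile(r"10\.\d{4,9}/\S+")

# Suffix-type junk assumed for this sketch: trailing punctuation that
# citation parsers often attach to the end of a DOI string.
TRAILING_JUNK = re.compile(r"[.,;:)\]]+$")

def clean_doi(raw):
    """Return a candidate corrected DOI string, or None if no DOI-like
    core can be found in the input (hypothetical helper)."""
    match = DOI_CORE.search(raw)  # skipping any prefix-type junk
    if match is None:
        return None
    return TRAILING_JUNK.sub("", match.group(0))  # strip suffix-type junk

# Example: prefix junk ("https://doi.org/") and suffix junk (")." ) removed.
print(clean_doi("https://doi.org/10.1000/xyz123)."))  # 10.1000/xyz123
```

In the actual protocol, a candidate produced this way would still be checked against the DOI API or the Crossref Public Data File before being accepted as the corrected "New_DOI".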