This work aims to identify classes of DOI mistakes by analysing the open bibliographic metadata available in Crossref, highlighting which publishers were responsible for such mistakes and how many of these incorrect DOIs could be corrected through automatic processes. Starting from a list of invalid cited DOIs gathered by OpenCitations over the past two years while processing the OpenCitations Index of Crossref open DOI-to-DOI citations (COCI), we retrieved the citations to those invalid DOIs from the January 2021 Crossref dump. We processed these citations by keeping track of their validity and of the publishers responsible for uploading the related citation data to Crossref. Finally, we identified patterns of factual errors in the invalid DOIs and the regular expressions needed to catch and correct them. The outcomes of this research show that only a few publishers were responsible for and/or affected by the majority of invalid citations. We extended the taxonomy of DOI name errors proposed in past studies and defined more elaborate regular expressions that can clean a higher number of mistakes in invalid DOIs than prior approaches. The data gathered in our study make it possible to investigate the reasons for DOI mistakes from a qualitative point of view, helping publishers identify the problems underlying their production of invalid citation data. In addition, the DOI cleaning mechanism we present could be integrated into existing processes (e.g. in COCI) so that citations with a wrong DOI are corrected automatically and added. This study was run strictly following Open Science principles and, as such, our research outcomes are fully reproducible.
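To make the idea of regex-based DOI cleaning concrete, here is a minimal Python sketch. The patterns and the names `PREFIX_NOISE`, `SUFFIX_NOISE`, `DOI_SHAPE`, and `clean_doi` are illustrative assumptions; they are not the regular expressions or the error taxonomy defined in the study, which the abstract does not spell out.

```python
import re
from typing import Optional

# Assumed examples of noise that can precede a DOI name (resolver URLs, "doi:" labels).
PREFIX_NOISE = re.compile(
    r"^(?:\s*(?:(?:https?://)?(?:dx\.)?doi\.org/|doi:?\s*))+", re.IGNORECASE
)
# Assumed examples of noise that can follow a DOI name (page fragments, punctuation).
# Caveat: a few legitimate DOIs end in ")" or similar, so a real cleaner should
# re-validate every corrected DOI against the DOI resolver.
SUFFIX_NOISE = re.compile(
    r"(?:/(?:abstract|full|pdf)|\.pdf|[.,;:)\]>]+)+$", re.IGNORECASE
)
# Loose shape check: "10.", a registrant code of 4 to 9 digits, "/", and a suffix.
DOI_SHAPE = re.compile(r"^10\.\d{4,9}/\S+$")


def clean_doi(raw: str) -> Optional[str]:
    """Return a normalised DOI candidate, or None if no DOI-like string remains."""
    candidate = PREFIX_NOISE.sub("", raw.strip())
    candidate = SUFFIX_NOISE.sub("", candidate)
    candidate = candidate.lower()  # DOI names are case-insensitive
    return candidate if DOI_SHAPE.match(candidate) else None
```

For instance, `clean_doi("doi:10.1000/ABC/abstract")` yields `"10.1000/abc"`, while a string with no DOI-like content yields `None`. A cleaned candidate is only a guess until it has been checked against the DOI resolver.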
Ekman's emotions (1992) are defined as universal basic emotions. Over the years, alternative models have emerged (e.g. Greene and Haidt 2002; Barrett 2017) describing emotions as social and linguistic constructions. The variety of models existing today raises the question of whether the abstraction provided by such models is sufficient as a descriptive/predictive tool for representing real-life emotional situations. Our study presents a social inquiry to test whether traditional models are sufficient to capture the complexity of daily-life emotions reported in a textual context. The intent of the study is to establish the human-subject agreement rate in an annotated corpus based on Ekman's theory (Entity-Level Tweets Emotional Analysis) and the human-subject agreement rate when using Ekman's emotions to annotate sentences that do not fit Ekman's model (The Dictionary of Obscure Sorrows). Furthermore, we investigated how much alexithymia can influence the human ability to detect and categorise emotions. In a total sample of 114 subjects, our results show low within-subject agreement rates for both datasets, particularly for subjects with low levels of alexithymia; low levels of agreement with the original annotations; and frequent use of emotions from Ekman's model, particularly negative ones, among people with high levels of alexithymia.
A preliminary note: This protocol illustrates the workflow adopted in a research project carried out within the OpenCitations environment. OpenCitations is an independent infrastructure organization for open scholarship dedicated to publishing open bibliographic and citation data using Semantic Web (Linked Data) technologies. COCI is the OpenCitations Index of Crossref open DOI-to-DOI citations.
Purpose: The purpose of this research is to find the publishers responsible for the citations missing from COCI because they sent incorrect metadata to Crossref, the publishers to which such invalid citations point, and the number of previously invalid citations that are now valid. The ultimate aim is to contribute to resolving this type of problem, so that the citations that are now valid can be inserted into COCI and those still invalid can be corrected, increasing the number of open citations available and indexed in the OpenCitations project.
Study design/methodology: We start from an already generated CSV file containing the valid citing DOIs and the invalid cited DOIs, which is available from Peroni, S. (2021). Citations to invalid DOI-identified entities obtained from processing DOI-to-DOI citations to add in COCI (1.0). Zenodo. https://doi.org/10.5281/ZENODO.4625300. These citations to invalid DOIs were retrieved while processing Crossref data to add open citations to COCI, but they were not added to COCI because they point to a non-resolvable cited document. Two REST API services can help: the DOI REST API, to check whether an invalid cited DOI is now valid; and the Crossref REST API, to retrieve the publisher from the prefix of the DOI, for both the cited and the citing publications.
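The two lookups described in the study design can be sketched in Python as follows. This is an assumption-laden illustration, not the protocol's actual code: the function names are hypothetical, the endpoints used are the public DOI proxy handle API (`https://doi.org/api/handles/<doi>`, where a `responseCode` of 1 signals a resolvable DOI) and the Crossref `prefixes` route (where the publisher name is read from `message.name`), and rate limiting, retries, and polite-pool headers are left out.

```python
import json
from typing import Callable, Optional
from urllib.error import HTTPError, URLError
from urllib.parse import quote
from urllib.request import urlopen

DOI_API = "https://doi.org/api/handles/"
CROSSREF_PREFIX_API = "https://api.crossref.org/prefixes/"


def http_get_json(url: str) -> Optional[dict]:
    """Fetch a URL and parse its JSON body; return None on any HTTP or parsing failure."""
    try:
        with urlopen(url, timeout=30) as response:
            return json.load(response)
    except (HTTPError, URLError, ValueError):
        return None


def doi_prefix(doi: str) -> str:
    """Return the registrant prefix, i.e. the part of the DOI before the first slash."""
    return doi.split("/", 1)[0]


def is_doi_valid(doi: str, fetch: Callable[[str], Optional[dict]] = http_get_json) -> bool:
    """Check whether the DOI now resolves, based on the handle API response code."""
    payload = fetch(DOI_API + quote(doi, safe="/"))
    return payload is not None and payload.get("responseCode") == 1


def publisher_of(doi: str, fetch: Callable[[str], Optional[dict]] = http_get_json) -> Optional[str]:
    """Look up the publisher name registered in Crossref for the DOI's prefix."""
    payload = fetch(CROSSREF_PREFIX_API + doi_prefix(doi))
    if not payload:
        return None
    return payload.get("message", {}).get("name")
```

The `fetch` parameter is a design choice made for this sketch: it lets the logic be exercised offline with a stub, and makes it easy to swap in a cached or throttled client when processing a large CSV of citing and cited DOIs.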
Findings: In addition to collecting the names of the publishers involved in these missing citations, either as the publisher of the citing article or as the publisher of the cited article, which was sufficient to answer our research questions, we decided to collect additional information that gives a better picture of the situation. In the JSON file, we recorded for each individual publisher 1) the number of citations it sent with incorrect metadata, and 2) the number of invalid citations it received. In addition, as required by the initial research questions, we extracted the total number of invalid citations that have since been corrected.
Originality/value: The results of this research may point to publishers that generally send out incorrect citation metadata and, conversely, to those that generally receive invalid citations. These findings can, first of all, raise awareness of how accurately (or inaccurately) certain publishing houses manage their metadata. Moreover, identifying these trends and showcasing the work involved in the corrections may lead to a growing share of valid citations if the proper measures are taken.
Research limitations/im...