Using AI to write scholarly publications

Hosseini, Mohammad; Rasmussen, Lisa M.; Resnik, David B.

doi:10.1080/08989621.2023.2168535

Cited by 133 publications

(86 citation statements)

References 10 publications

“…7 The policy also advises researchers who use these tools to document this use in the Methods or Acknowledgment sections of manuscripts. 7 Other journals 8,9 and organizations 10 are swiftly developing policies that ban inclusion of these nonhuman technologies as "authors" and that range from prohibiting the inclusion of AI-generated text in submitted work 8 to requiring full transparency, responsibility, and accountability for how such tools are used and reported in scholarly publication. 9,10 The International Conference on Machine Learning, which issues calls for papers to be reviewed and discussed at its conferences, has also announced a new policy: "Papers that include text generated from a large-scale language model (LLM) such as ChatGPT are…”

mentioning

confidence: 99%

“…The scholarly publishing community has quickly reported concerns about potential misuse of these language models in scientific publication. Individuals have experimented by asking ChatGPT a series of questions about controversial or important topics (eg, whether childhood vaccination causes autism) as well as specific publishing-related technical and ethical questions. Their results showed that ChatGPT’s text responses to questions, while mostly well written, are formulaic (which was not easily discernible), not up to date, false or fabricated, without accurate or complete references, and worse, with concocted nonexistent evidence for claims or statements it makes.…”

mentioning

confidence: 99%

“…1,[12][13][14] Individuals have experimented by asking ChatGPT a series of questions about controversial or important topics (eg, whether childhood vaccination causes autism) as well as specific publishing-related technical and ethical questions. 9,10,12 Their results showed that ChatGPT's text responses to questions, while mostly well written, are formulaic (which was not easily discernible), not up to date, false or fabricated, without accurate or complete references, and worse, with concocted nonexistent evidence for claims or statements it makes. OpenAI acknowledges some of the language model's limitations, including providing "plausible-sounding but incorrect or nonsensical answers," and that the recent release is part of an open iterative deployment intended for human use, interaction, and feedback to improve it.…”

mentioning

confidence: 99%

“…Nature has since defined a policy to guide the use of large-scale language models in scientific publication, which prohibits naming of such tools as a “credited author on a research paper” because “attribution of authorship carries with it accountability for the work, and AI tools cannot take such responsibility.” The policy also advises researchers who use these tools to document this use in the Methods or Acknowledgment sections of manuscripts. Other journals and organizations are swiftly developing policies that ban inclusion of these nonhuman technologies as “authors” and that range from prohibiting the inclusion of AI-generated text in submitted work to requiring full transparency, responsibility, and accountability for how such tools are used and reported in scholarly publication. The International Conference on Machine Learning, which issues calls for papers to be reviewed and discussed at its conferences, has also announced a new policy: “Papers that include text generated from a large-scale language model (LLM) such as ChatGPT are prohibited unless the produced text is presented as a part of the paper’s experimental analysis.” The society notes that this policy has generated a flurry of questions and that it plans “to investigate and discuss the impact, both positive and negative, of LLMs on reviewing and publishing in the field of machine learning and AI” and will revisit the policy in the future…”

mentioning

confidence: 99%

Nonhuman “Authors” and Implications for the Integrity of Scientific Publication and Medical Knowledge

Flanagin, Bibbins‐Domingo, Berkwits, et al. 2023. JAMA

259

130

Artificial intelligence (AI) technologies to help authors improve the preparation and quality of their manuscripts and published articles are rapidly increasing in number and sophistication. These include tools to assist with writing, grammar, language, references, statistical analysis, and reporting standards. Editors and publishers also use AI-assisted tools for myriad purposes, including to screen submissions for problems (eg, plagiarism, image manipulation, ethical issues), triage submissions, validate references, edit, and code content for publication in different media and to facilitate postpublication search and discoverability. 1 In November 2022, OpenAI released a new open source, natural language processing tool called ChatGPT. 2,3 ChatGPT is an evolution of a chatbot that is designed to simulate human conversation in response to prompts or questions (GPT stands for "generative pretrained transformer"). The release has prompted immediate excitement about its many potential uses 4 but also trepidation about potential misuse, such as concerns about using the language model to cheat on homework assignments, write student essays, and take examinations, including medical licensing examinations. 5 In January 2023, Nature reported on 2 preprints and 2 articles published in the science and health fields that included ChatGPT as a bylined author. 6 Each of these includes an affiliation for ChatGPT, and 1 of the articles includes an email address for the nonhuman "author." According to Nature, that article's inclusion of ChatGPT in the author byline was an "error that will soon be corrected." 6 However, these articles and their nonhuman "authors" have already been indexed in PubMed and Google Scholar. Nature has since defined a policy to guide the use of large-scale language models in scientific publication, which prohibits naming of such tools as a "credited author on a research paper" because "attribution of authorship carries with it accountability for the work, and AI tools cannot take such responsibility." 7 The policy also advises researchers who use these tools to document this use in the Methods or Acknowledgment sections of manuscripts. 7 Other journals 8,9 and organizations 10 are swiftly developing policies that ban inclusion of these nonhuman technologies as "authors" and that range from prohibiting the inclusion of AI-generated text in submitted work 8 to requiring full transparency, responsibility, and accountability for how such tools are used and reported in scholarly publication. 9,10 The International Conference on Machine Learning, which issues calls for papers to be reviewed and discussed at its conferences, has also announced a new policy: "Papers that include text generated from a large-scale language model (LLM) such as ChatGPT are

“…[5] Similarly, many publication houses have come up with guidelines to the authors regarding the extent use of such AI tools in the manuscript, along with its justification. [6] The World Association of Medical Editors have recently laid down recommendations for the use of Chatbots in medical publishing. The recommendations say that these language processing tools cannot be eligible for authorship of any manuscript, authors of the manuscript will be responsible for all data generated by AI tools, authors should be transparent about usage of such tools, and the editors should have updated software to detect the use of such AI-assisted tools.…”

mentioning

confidence: 99%

Artificial intelligence-assisted medical writing: With greater power comes greater responsibility

Bains 2023. AJOHAS

The use of technology and artificial intelligence (AI) in medical publishing is not new. Researchers and publishers have been using it for grammar correction, managing references, editing tools, medical evidence synthesis in the form of systematic reviews, and plagiarism check software, to name a few. [1,2] In November 2022, an AI-assisted natural language processing tool by the name of ChatGPT took the internet by storm. ChatGPT (where GPT stands for generative pre-trained transformer), which is an evolution of a chatbot, was programmed to simulate human conversations and responses to prompts. [3] It gained immense popularity since its introduction, and soon the news related to its writing abilities for school essays, research publications, and even passing an MBA examination at the Wharton School of the University of Pennsylvania, USA, was the talk of the day. [4] However, issues pertaining to its misuse, ethical concerns, and copyright issues were also raised. It has been seen as a potential threat to students' creativity and writing skills. With the use of such tools, the fine line between originality and a plagiarized write-up is overwhelmingly blurred as it produces a document as good as an "original" just taking prompts from the keywords put. For example, on putting the phrase "write an editorial on use of AI in medical writing" in the ChatGPT prompt box, it produced a 317 word long document within a matter of seconds [Figure 1].
It is very tempting, as one feels that the work is done instantly, without putting any efforts, but on the same hand, the user should keep this in mind that it is just processing a grammatically correct collection of words and sentences, without an actual human thought process or any references/citations to validate its verity. Interestingly, in January 2023, Nature reported on two articles which listed ChatGPT as an author, and included an affiliation and email address for the "non-human" author, though now Nature has updated its policy, which prohibits naming of such tools as a "credited author on a research paper," because "attribution of authorship carries with it accountability for the work, and AI tools cannot take such responsibility." [5] Similarly, many publication houses have come up with guidelines to the authors regarding the extent use of such AI tools in the manuscript, along with its justification. [6] The World Association of Medical Editors have recently laid down recommendations for the use of Chatbots in medical publishing. The recommendations say that these language processing tools cannot be eligible for authorship of any manuscript, authors of the manuscript will be responsible for all data generated by AI tools, authors should be transparent about usage of such tools, and the editors should have updated software to detect the use of such AI-assisted tools. [7] Turnitin, the Plagiarism check software service provider, has already come up with solutions that can …

Accuracy and Reliability of Chatbot Responses to Physician Questions

Goodman, Patrinely, Stone, et al. 2023. JAMA Netw Open

162

Importance: Natural language processing tools, such as ChatGPT (generative pretrained transformer, hereafter referred to as chatbot), have the potential to radically enhance the accessibility of medical information for health professionals and patients. Assessing the safety and efficacy of these tools in answering physician-generated questions is critical to determining their suitability in clinical settings, facilitating complex decision-making, and optimizing health care efficiency.

Objective: To assess the accuracy and comprehensiveness of chatbot-generated responses to physician-developed medical queries, highlighting the reliability and limitations of artificial intelligence–generated medical information.

Design, Setting, and Participants: Thirty-three physicians across 17 specialties generated 284 medical questions that they subjectively classified as easy, medium, or hard with either binary (yes or no) or descriptive answers. The physicians then graded the chatbot-generated answers to these questions for accuracy (6-point Likert scale with 1 being completely incorrect and 6 being completely correct) and completeness (3-point Likert scale, with 1 being incomplete and 3 being complete plus additional context). Scores were summarized with descriptive statistics and compared using the Mann-Whitney U test or the Kruskal-Wallis test. The study (including data analysis) was conducted from January to May 2023.

Main Outcomes and Measures: Accuracy, completeness, and consistency over time and between 2 different versions (GPT-3.5 and GPT-4) of chatbot-generated medical responses.

Results: Across all questions (n = 284) generated by 33 physicians (31 faculty members and 2 recent graduates from residency or fellowship programs) across 17 specialties, the median accuracy score was 5.5 (IQR, 4.0-6.0) (between almost completely and complete correct) with a mean (SD) score of 4.8 (1.6) (between mostly and almost completely correct). The median completeness score was 3.0 (IQR, 2.0-3.0) (complete and comprehensive) with a mean (SD) score of 2.5 (0.7). For questions rated easy, medium, and hard, the median accuracy scores were 6.0 (IQR, 5.0-6.0), 5.5 (IQR, 5.0-6.0), and 5.0 (IQR, 4.0-6.0), respectively (mean [SD] scores were 5.0 [1.5], 4.7 [1.7], and 4.6 [1.6], respectively; P = .05). Accuracy scores for binary and descriptive questions were similar (median score, 6.0 [IQR, 4.0-6.0] vs 5.0 [IQR, 3.4-6.0]; mean [SD] score, 4.9 [1.6] vs 4.7 [1.6]; P = .07). Of 36 questions with scores of 1.0 to 2.0, 34 were requeried or regraded 8 to 17 days later with substantial improvement (median score 2.0 [IQR, 1.0-3.0] vs 4.0 [IQR, 2.0-5.3]; P < .01). A subset of questions, regardless of initial scores (version 3.5), were regenerated and rescored using version 4 with improvement (mean accuracy [SD] score, 5.2 [1.5] vs 5.7 [0.8]; median score, 6.0 [IQR, 5.0-6.0] for original and 6.0 [IQR, 6.0-6.0] for rescored; P = .002).

Conclusions and Relevance: In this cross-sectional study, chatbot generated largely accurate information to diverse medical queries as judged by academic physician specialists with improvement over time, although it had important limitations. Further research and model development are needed to correct inaccuracies and for validation.
