Background Cerebral hemorrhage is a critical medical condition that necessitates a rapid and precise diagnosis for timely medical intervention, including emergency operation. Computed tomography (CT) is essential for identifying cerebral hemorrhage, but its effectiveness is limited by the availability of experienced radiologists, especially in resource-constrained regions or when shorthanded during holidays or at night. Despite advancements in artificial intelligence–driven diagnostic tools, most require technical expertise. This poses a challenge for widespread adoption in radiological imaging. The introduction of advanced natural language processing (NLP) models such as GPT-4, which can annotate and analyze images without extensive algorithmic training, offers a potential solution. Objective This study investigates GPT-4’s capability to identify and annotate cerebral hemorrhages in cranial CT scans. It represents a novel application of NLP models in radiological imaging. Methods In this retrospective analysis, we collected 208 CT scans with 6 types of cerebral hemorrhages at Ren Ji Hospital, Shanghai Jiao Tong University School of Medicine, between January and September 2023. All CT images were mixed together and sequentially numbered, so each CT image had its own corresponding number. A random sequence from 1 to 208 was generated, and all CT images were inputted into GPT-4 for analysis in the order of the random sequence. The outputs were subsequently examined using Photoshop and evaluated by experienced radiologists on a 4-point scale to assess identification completeness, accuracy, and success. Results The overall identification completeness percentage for the 6 types of cerebral hemorrhages was 72.6% (SD 18.6%). 
Specifically, GPT-4 achieved higher identification completeness in epidural and intraparenchymal hemorrhages (89.0%, SD 19.1% and 86.9%, SD 17.7%, respectively), yet its identification completeness percentage in chronic subdural hemorrhages was very low (37.3%, SD 37.5%). The misidentification percentages for complex hemorrhages (54.0%, SD 28.0%), epidural hemorrhages (50.2%, SD 22.7%), and subarachnoid hemorrhages (50.5%, SD 29.2%) were relatively high, whereas they were relatively low for acute subdural hemorrhages (32.6%, SD 26.3%), chronic subdural hemorrhages (40.3%, SD 27.2%), and intraparenchymal hemorrhages (26.2%, SD 23.8%). The identification completeness percentages in both massive and minor bleeding showed no significant difference (P=.06). However, the misidentification percentage in recognizing massive bleeding was significantly lower than that for minor bleeding (P=.04). The identification completeness percentages and misidentification percentages for cerebral hemorrhages at different locations showed no significant differences (all P>.05). Lastly, radiologists showed relative acceptance regarding identification completeness (3.60, SD 0.54), accuracy (3.30, SD 0.65), and success (3.38, SD 0.64). Conclusions GPT-4, a standout among NLP models, exhibits both promising capabilities and certain limitations in the realm of radiological imaging, particularly when it comes to identifying cerebral hemorrhages in CT scans. This opens up new directions and insights for the future development of NLP models in radiology. Trial Registration ClinicalTrials.gov NCT06230419; https://clinicaltrials.gov/study/NCT06230419
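The randomized presentation order described in the Methods can be illustrated with a minimal sketch. The abstract does not say how the random sequence from 1 to 208 was generated, so the function name `presentation_order`, the use of Python's standard `random` module, and the fixed seed are all assumptions made for illustration, not the authors' procedure.

```python
import random
from typing import List, Optional


def presentation_order(n_images: int, seed: Optional[int] = None) -> List[int]:
    """Return the image numbers 1..n_images in a random order.

    Hypothetical helper: the study only states that a random sequence
    from 1 to 208 was generated; the generator used is not specified.
    """
    rng = random.Random(seed)              # seeded generator so the order is reproducible
    order = list(range(1, n_images + 1))   # sequential image numbers 1..n_images
    rng.shuffle(order)                     # in-place uniform shuffle
    return order


# Assumed usage: 208 numbered CT images, reviewed in shuffled order.
order = presentation_order(208, seed=0)
```

Every image number appears exactly once in `order`, so each scan is analyzed once, and fixing the seed lets the same sequence be regenerated later for auditing.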
BACKGROUND Cerebral hemorrhage is a critical medical condition that necessitates a rapid and precise diagnosis for timely medical intervention, including emergency operation. Computed tomography (CT) is essential for identifying cerebral hemorrhage, but its effectiveness is limited by the availability of experienced radiologists, especially in resource-constrained regions or when shorthanded during holidays or at night. Despite advancements in artificial intelligence (AI)-driven diagnostic tools, most require technical expertise, which poses a challenge for widespread adoption in radiological imaging. The introduction of advanced natural language processing (NLP) models such as GPT-4, which can annotate and analyze images without extensive algorithmic training, offers a potential solution. OBJECTIVE This study investigates GPT-4's capability to identify and annotate cerebral hemorrhage in cranial CT scans, a novel application of NLP models in radiological imaging. METHODS In this retrospective analysis, we collected 208 CT scans with six types of cerebral hemorrhage at Ren Ji Hospital, Shanghai Jiao Tong University School of Medicine, between January and September 2023. All CT images were fed into GPT-4 in random order for the analysis and annotation of cerebral hemorrhage. The outputs were subsequently examined using Photoshop and evaluated by experienced radiologists on a 4-point scale to assess identification completeness, accuracy, and success. RESULTS The overall identification completeness percentage for the six types of cerebral hemorrhage was 72.59 ± 18.62%. 
Specifically, GPT-4 achieved higher identification completeness percentages in epidural and intraparenchymal hemorrhages (89.02 ± 19.01% and 86.86 ± 17.69%, respectively), yet its identification completeness percentage in chronic subdural hemorrhages was very low (37.35 ± 37.50%). The misidentification percentages for complex hemorrhage, epidural hemorrhage, and subarachnoid hemorrhage were relatively high (54.00 ± 28.04%, 50.25 ± 22.65%, and 50.54 ± 29.18%, respectively), whereas they were relatively low for acute subdural hemorrhage, chronic subdural hemorrhage, and intraparenchymal hemorrhage (32.61 ± 26.27%, 40.34 ± 27.19%, and 26.24 ± 23.85%, respectively). The identification completeness percentages for massive and minor bleeding showed no significant difference. However, the misidentification percentage for massive bleeding was significantly lower than that for minor bleeding. The identification completeness and misidentification percentages for cerebral hemorrhage at different locations showed no significant differences. Lastly, radiologists showed relative acceptance regarding identification completeness, accuracy, and success (3.60 ± 0.54, 3.30 ± 0.65, and 3.38 ± 0.64, respectively). CONCLUSIONS GPT-4, a standout among NLP models, exhibits both promising capabilities and certain limitations in the realm of radiological imaging, particularly when it comes to identifying cerebral hemorrhages in CT scans. This opens up new directions and insights for the future development of NLP models in radiology. CLINICALTRIAL This retrospective study was registered at ClinicalTrials.gov (NCT06230419) and approved by the Ethics Committee of Ren Ji Hospital, Shanghai Jiao Tong University School of Medicine.