Background: Chat Generative Pre-trained Transformer (ChatGPT) is a 175-billion-parameter natural language processing model that can generate conversation-style responses to user input.
Objective: This study aimed to evaluate the performance of ChatGPT on questions within the scope of the United States Medical Licensing Examination (USMLE) Step 1 and Step 2 exams, as well as to analyze its responses for user interpretability.
Methods: We used 2 sets of multiple-choice questions to evaluate ChatGPT's performance, each with questions pertaining to Step 1 and Step 2. The first set was derived from AMBOSS, a commonly used question bank for medical students, which also provides statistics on question difficulty and on exam performance relative to its user base. The second set was the National Board of Medical Examiners (NBME) free 120 questions. ChatGPT's performance was compared with that of 2 other large language models, GPT-3 and InstructGPT. The text output of each ChatGPT response was evaluated across 3 qualitative metrics: logical justification of the answer selected, presence of information internal to the question, and presence of information external to the question.
Results: On the 4 data sets, AMBOSS-Step1, AMBOSS-Step2, NBME-Free-Step1, and NBME-Free-Step2, ChatGPT achieved accuracies of 44% (44/100), 42% (42/100), 64.4% (56/87), and 57.8% (59/102), respectively. ChatGPT outperformed InstructGPT by 8.15% on average across all data sets, and GPT-3 performed similarly to random chance. The model demonstrated a significant decrease in performance as question difficulty increased (P=.01) within the AMBOSS-Step1 data set. We found that logical justification for ChatGPT's answer selection was present in 100% of outputs of the NBME data sets. Information internal to the question was present in 96.8% (183/189) of all questions. The presence of information external to the question was 44.5% and 27% lower for incorrect answers relative to correct answers on the NBME-Free-Step1 (P<.001) and NBME-Free-Step2 (P=.001) data sets, respectively.
Conclusions: ChatGPT marks a significant improvement in natural language processing models on the task of medical question answering. By performing above the 60% threshold on the NBME-Free-Step1 data set, we show that the model achieves the equivalent of a passing score for a third-year medical student. Additionally, we highlight ChatGPT's capacity to provide logic and informational context across the majority of answers. These facts taken together make a compelling case for the potential applications of ChatGPT as an interactive medical education tool to support learning.
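The accuracy figures above are simple proportions, and the subgroup comparisons (for example, external information present versus absent by answer correctness) are tests on 2x2 contingency tables. The sketch below shows how such figures could be recomputed; it assumes a chi-square test of independence, since the abstract does not name the exact test used, and the contingency-table cell counts are illustrative placeholders rather than the study's data.

```python
from scipy.stats import chi2_contingency

# Reported correct/total counts per data set (taken from the abstract).
results = {
    "AMBOSS-Step1": (44, 100),
    "AMBOSS-Step2": (42, 100),
    "NBME-Free-Step1": (56, 87),
    "NBME-Free-Step2": (59, 102),
}

for name, (correct, total) in results.items():
    print(f"{name}: {correct}/{total} = {correct / total:.1%}")

# Hypothetical 2x2 table: rows = information external to the question
# present/absent, columns = answer correct/incorrect. These cell counts
# are placeholders, NOT the study's underlying data.
table = [
    [40, 10],  # external info present: correct, incorrect
    [16, 21],  # external info absent:  correct, incorrect
]
chi2, p, dof, expected = chi2_contingency(table)
print(f"chi-square = {chi2:.2f}, dof = {dof}, P = {p:.3f}")
```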
The identification of accurate harbingers of disease status and therapeutic efficacy is a critical requirement for precise diagnosis and effective management. Tissue analysis was initially regarded as ideal, but invasive sampling carries risk compared with peripheral blood sampling. Thus far, most biomarkers, whether in tissue or blood/urine, have been single analytes with varying degrees of sensitivity and specificity; some analytes have not exhibited robust metrics or have lacked methodological rigor. Neuroendocrine disease represents an area of dire biomarker paucity, since individual biomarkers (gastrin, insulin, etc.) are not widely applicable across the diverse types of neuroendocrine neoplasia, and broad-spectrum markers such as chromogranin A have limitations in sensitivity, specificity, and reproducibility. Monoanalytes cannot capture the multiple variables (proliferation, metabolic activity, invasive potential, metastatic propensity) that constitute tumor growth. The restricted status of the neuroendocrine neoplasia field has resulted in a lack of comprehensive knowledge of the molecular and cellular biology of the disease, with tardy application of innovative technology. This overview examines the limitations of current practice and describes viable contemporary strategies under evaluation, including the identification of novel analytes (gene transcripts, microRNA), circulating tumor cells, and metabolic imaging agents that identify disease. New approaches are needed to develop biomathematical algorithms for synchronous calibration of multiple molecular markers, and predictive nomograms that interface biological variables to delineate disease progress or treatment efficacy. Optimally, the application of novel techniques and multianalyte assessment will provide a personalized molecular disease signature extrapolative of neuroendocrine neoplasia status, likelihood of progression, and therapeutic opportunity.
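To make the "biomathematical algorithm" concept above concrete, the sketch below shows one generic way multiple normalized analyte measurements could be combined into a single 0-to-1 disease-activity score via a logistic link. The analyte names, weights, and intercept are hypothetical placeholders, not values from any validated assay.

```python
import math

def multianalyte_score(levels, weights, intercept=0.0):
    """Combine normalized analyte levels into a 0-1 score via a logistic link."""
    z = intercept + sum(weights[name] * levels[name] for name in weights)
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical analytes and weights, for illustration only.
levels = {"transcript_A": 1.8, "transcript_B": 0.4, "miR_X": 2.1}
weights = {"transcript_A": 0.9, "transcript_B": -0.5, "miR_X": 1.2}

score = multianalyte_score(levels, weights, intercept=-2.0)
print(f"disease-activity score: {score:.2f}")  # closer to 1 = higher activity
```

In practice, the weights would be fit by regression against clinical outcomes, and the resulting score would require calibration and validation in independent cohorts before any clinical interpretation.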
Background: The chromatin remodeler NAP1L1, which is upregulated in small intestinal neuroendocrine neoplasms (NENs), has been implicated in cell cycle progression. As p57Kip2 (CDKN1C), a negative regulator of proliferation and a tumor suppressor, is controlled by members of the NAP1 family, we tested the hypothesis that NAP1L1 may have a mechanistic role in regulating pancreatic NEN proliferation through regulation of p57Kip2.
Results: NAP1L1 silencing (siRNA and shRNA/lipofectamine approach) decreased proliferation through inhibition of mechanistic (mammalian) target of rapamycin pathway proteins and their phosphorylation (p < 0.05) in the pancreatic neuroendocrine neoplasm cell line BON in vitro (p < 0.0001) and resulted in significantly smaller (p < 0.05) and lighter (p < 0.05) tumors in the orthotopic pancreatic NEN mouse model. Methylation of the p57Kip2 promoter was decreased by NAP1L1 silencing (p < 0.05), and expression of p57Kip2 (transcript and protein) was upregulated. NAP1L1 bound directly to the p57Kip2 promoter (−164 to +21, chromatin immunoprecipitation). In 43 pancreatic NEN samples (38 primaries and 5 metastases), NAP1L1 was overexpressed in metastases (p < 0.001), and its expression was inversely correlated with p57Kip2 (p < 0.01) at the mRNA and protein levels. Menin was not differentially expressed.
Conclusion: NAP1L1 is overexpressed in pancreatic neuroendocrine neoplasm metastases and epigenetically promotes cell proliferation through regulation of p57Kip2 promoter methylation.
Background: The Hedgehog (HH) pathway is a mediator in pancreatic ductal adenocarcinoma (PDAC). Surprisingly, previous studies suggested that primary cilia (PC), the organelles essential for HH signal transduction, are lost in PDAC. The aim of this study was to determine the presence of PC in human normal pancreas, chronic pancreatitis, and during carcinogenesis to PDAC, with focus on both epithelium and stroma.
Methods: PC were analyzed in paraffin sections from normal pancreas, chronic pancreatitis, intraductal papillary-mucinous neoplasia, and PDAC, as well as in primary human pancreatic stellate cells (PSC) and pancreatic cancer cell lines, by double immunofluorescence staining for acetylated α-tubulin and γ-tubulin. Co-staining for the HH receptors PTCH1, PTCH2, and SMO was also performed.
Results: PC are gradually lost from the epithelium during pancreatic carcinogenesis: the fraction of cells with PC decreased significantly from 32% in ducts of normal pancreas to 21% in ducts of chronic pancreatitis, 18% in PanIN1a, 6% in PanIN2, 3% in PanIN3, and 1.2% in invasive PDAC. However, this loss of PC in the neoplastic epithelium is accompanied by a gain of PC in the surrounding stroma: the fraction of stromal cells with PC increased significantly from 13% around normal ducts to about 30% around PanIN and PDAC. HH receptors were detected in tumor stroma but not in epithelial cells. PC are also present in PSC and pancreatic cancer cell lines.
Conclusion: PC are not lost during pancreatic carcinogenesis but are redistributed from the epithelium to the stroma. This redistribution may explain the redirection of HH signaling toward the stroma during pancreatic carcinogenesis.