2024
DOI: 10.1021/acs.jchemed.4c00058
Comment on “Comparing the Performance of College Chemistry Students with ChatGPT for Calculations Involving Acids and Bases”

Joshua Schrier

Abstract: In a recent paper in this Journal (J. Chem. Educ. 2023, 100, 3934−3944), Clark et al. evaluated the performance of the GPT-3.5 large language model (LLM) on ten undergraduate pH calculation problems. They reported that GPT-3.5 gave especially poor results for salt and titration problems, returning the correct results only 10% and 0% of the time, respectively, and that, despite a correct application of heuristics, the LLM made mathematical errors and used flawed strategies. However, these problems are partially…
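For context, the salt problems at issue are of the following kind, where arithmetic slips (not flawed heuristics) are easy to make. This is a hypothetical illustration, not one of the ten problems from Clark et al.; a minimal sketch using the usual simplifying assumption [OH−] = √(Kb·C):

```python
import math

def ph_of_weak_base_salt(c_salt, ka_acid, kw=1.0e-14):
    """pH of a salt of a weak acid (e.g., sodium acetate),
    assuming hydrolysis is small relative to the salt concentration."""
    kb = kw / ka_acid            # conjugate-base hydrolysis constant
    oh = math.sqrt(kb * c_salt)  # valid when [OH-] << c_salt
    poh = -math.log10(oh)
    return 14.0 - poh

# 0.10 M sodium acetate, Ka(acetic acid) = 1.8e-5
print(round(ph_of_weak_base_salt(0.10, 1.8e-5), 2))  # 8.87
```

The function names and example values here are illustrative; the point is that each step is a deterministic calculation of exactly the sort an LLM is reported to fumble.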

Cited by 5 publications (4 citation statements)
References 12 publications
“…We made no attempt at prompt engineering or hyperparameter optimization of the fine-tuning process. External function calling of “tools” (e.g., performing numerical or thermodynamic calculations) can be combined with iterative chain-of-thought methods (“think step by step”) to further improve problem solving. Finally, we expect continued advances in LLMs and fine-tuning methodologies to improve performance.…”
Section: Precursor Selection
confidence: 99%
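The "external function calling" pattern mentioned in the quote above can be sketched as follows. This is an illustrative dispatcher, not any particular vendor's tools API: the model proposes a tool name and arguments, and the host program executes trusted numeric code so the arithmetic is never left to the LLM.

```python
import math

# Hypothetical tool registry: names and schema are illustrative only.
TOOLS = {
    "ph_from_h": lambda h: -math.log10(h),  # pH from [H+] in mol/L
    "sqrt": math.sqrt,
}

def run_tool_call(call):
    """Dispatch a model-proposed tool call to a deterministic function."""
    return TOOLS[call["name"]](*call["args"])

# e.g., the model requests the pH corresponding to [H+] = 2.5e-4 M
print(round(run_tool_call({"name": "ph_from_h", "args": [2.5e-4]}), 2))  # 3.6
```

In a chain-of-thought loop, the model's intermediate reasoning selects which tool to call; only the final numeric evaluation runs in code.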
“…General purpose large language models (LLMs) are a form of generative artificial intelligence (AI), pretrained on a broad data set so they can be applied to many different tasks using natural language. Pretrained LLMs have been investigated for a wide variety of chemical tasks, such as extracting structured data from the literature, writing numerical simulation software, and education. LLM-based workflows have been used to plan syntheses of organic molecules and metal–organic frameworks (MOFs). Recent work has benchmarked materials science and general chemical knowledge of existing LLMs, and there are efforts to develop chemistry/materials-specific LLMs. Fine-tuning LLMs on modest amounts of data improves performance for specific tasks, while still taking advantage of the general pretraining to provide basic symbol interpretation and output formatting guidance.…”
confidence: 99%
“…In the Fall of 2023, my course used one of these AI tools (ChatGPT-4) as the predominant means for creating and refining data visualizations, and this manuscript discusses the approach used as well as its relative merits and limitations as well as student perceptions and outcomes. It should be mentioned that there is already significant discussion of the role of AI tools in chemical education, focusing on topics such as writing aids, lesson planning aids, strengths and shortcomings for problem solving, concerns over academic integrity, and enhancing student learning activities. However, to my knowledge, there are yet no articles discussing the use of AI to help teach the design of data visualizations. It is hoped the approach discussed below could be incorporated into other courses, thereby better preparing students to effectively communicate their science.…”
Section: Introduction
confidence: 99%
License: CC BY-NC-ND 4.0
“…chain-of-thought methods (“think step by step”) to further improve problem solving. Finally, we expect continued advances in LLMs and fine-tuning methodologies to improve performance.…”
confidence: 99%