Letting the Genie Out of the Lamp: Using Natural Language Processing Tools to Predict Math Performance

Crossley, Scott A.; Kostyuk, Victor

doi:10.1007/978-3-319-59888-8_28

Cited by 6 publications

(6 citation statements)

References 24 publications


“…While education researchers have long argued that off-topic conversation takes time away from learning (Carroll, 1963), there is evidence that small talk is associated with more effective collaboration in human-human learning (Kreijns, 2004). Similar rapport has been created by conversational agents (Crossley & Kostyuk, 2017). In our work we employed small talk along with recommendations to gently nudge the user into content.…”

Section: Analyses Of Small Talk (mentioning)

confidence: 99%

“…As seen in systems that use wizard of oz approaches to generate small talk (e.g. Crossley & Kostyuk, 2017), students develop social relationships with the system, explicitly asking Curio SmartChat questions about its family, friends and hobbies. When a question is beyond the capacity of Curio SmartChat to answer, a default response-"Please ask me about middle school topics in Science" is provided.…”

Section: Analyses Of Small Talk (mentioning)

confidence: 99%


Curio SmartChat : A system for Natural Language Question Answering for Self-Paced K-12 Learning

Raamadhurai, Baker, Poduval

2019

Proceedings of the Fourteenth Workshop on Innovative Use of NLP for Building Educational Applications


During learning, students often have questions which they would benefit from responses to in real time. In class, a student can ask a question to a teacher. During homework, or even in class if the student is shy, it can be more difficult to receive a rapid response. In this work, we introduce Curio SmartChat, an automated question answering system for middle school Science topics. Our system has now been used by around 20,000 students who have so far asked over 100,000 questions. We present data on the challenge created by students' grammatical errors and spelling mistakes, and discuss our system's approach and degree of effectiveness at disambiguating questions that the system is initially unsure about. We also discuss the prevalence of student "small talk" not related to science topics, the pluses and minuses of this behavior, and how a system should respond to these conversational acts. We conclude with discussions and point to directions for potential future work.


“…The length of patients’ aggregated SMs ranged from 1 word and 16 469 words, with a mean length of 2058.95 words. To provide appropriate linguistic coverage to develop literacy profiles, we excluded patients whose aggregated secure messages lacked sufficient words (<50 words, see Figure 1), a threshold based on previous NLP text research in learning analytics domains 19,20 …”

Section: Methods (mentioning)

confidence: 99%

“…To provide appropriate linguistic coverage to develop literacy profiles, we excluded patients whose aggregated secure messages lacked sufficient words (<50 words, see Figure 1), a threshold based on previous NLP text research in learning analytics domains. 19,20 This study was approved by the KPNC and UCSF Institutional Review Boards (IRBs). All analyses involved secondary data and all data were housed on a password-protected secure KPNC server that could only be accessed by authorized researchers.…”

Section: What This Study Adds (mentioning)

confidence: 99%

Employing computational linguistics techniques to identify limited patient health literacy: Findings from the ECLIPPSE study

Schillinger, Balyan, Crossley, et al. 2020

Health Services Research


Objective: To develop novel, scalable, and valid literacy profiles for identifying limited health literacy patients by harnessing natural language processing.

Data Source: With respect to the linguistic content, we analyzed 283 216 secure messages sent by 6941 diabetes patients to physicians within an integrated system's electronic portal. Sociodemographic, clinical, and utilization data were obtained via questionnaire and electronic health records.

Study Design: Retrospective study used natural language processing and machine learning to generate five unique “Literacy Profiles” by employing various sets of linguistic indices: Flesch‐Kincaid (LP_FK); basic indices of writing complexity, including lexical diversity (LP_LD) and writing quality (LP_WQ); and advanced indices related to syntactic complexity, lexical sophistication, and diversity, modeled from self‐reported (LP_SR), and expert‐rated (LP_Exp) health literacy. We first determined the performance of each literacy profile relative to self‐reported and expert‐rated health literacy to discriminate between high and low health literacy and then assessed Literacy Profiles’ relationships with known correlates of health literacy, such as patient sociodemographics and a range of health‐related outcomes, including ratings of physician communication, medication adherence, diabetes control, comorbidities, and utilization.

Principal Findings: LP_SR and LP_Exp performed best in discriminating between high and low self‐reported (C‐statistics: 0.86 and 0.58, respectively) and expert‐rated health literacy (C‐statistics: 0.71 and 0.87, respectively) and were significantly associated with educational attainment, race/ethnicity, Consumer Assessment of Provider and Systems (CAHPS) scores, adherence, glycemia, comorbidities, and emergency department visits.

Conclusions: Since health literacy is a potentially remediable explanatory factor in health care disparities, the development of automated health literacy indicators represents a significant accomplishment with broad clinical and population health applications. Health systems could apply literacy profiles to efficiently determine whether quality of care and outcomes vary by patient health literacy; identify at‐risk populations for targeting tailored health communications and self‐management support interventions; and inform clinicians to promote improvements in individual‐level care.


“…Patients whose aggregated SMs lacked sufficient words (<50 words) to provide linguistic coverage were removed. Our 50-word threshold was based on previous NLP text analyses in learning analytics domains [61–62]. The final cleaned data consisted of 6,941 patients and 283,216 SMs.…”

Section: Methods (mentioning)

confidence: 99%

Using natural language processing and machine learning to classify health literacy from secure messages: The ECLIPPSE study

et al. 2019


Limited health literacy is a barrier to optimal healthcare delivery and outcomes. Current measures requiring patients to self-report limitations are time-consuming and may be considered intrusive by some. This makes widespread classification of patient health literacy challenging. The objective of this study was to develop and validate “literacy profiles” as automated indicators of patients’ health literacy to facilitate a non-intrusive, economic and more comprehensive characterization of health literacy among a health care delivery system’s membership. To this end, three literacy profiles were generated based on natural language processing (combining computational linguistics and machine learning) using a sample of 283,216 secure messages sent from 6,941 patients to their primary care physicians. All patients were participants in Kaiser Permanente Northern California’s DISTANCE Study. Performance of the three literacy profiles were compared against a gold standard of patient self-reported health literacy. Associations were analyzed between each literacy profile and patient demographics, health outcomes and healthcare utilization. T-tests were used for numeric data such as A1C, Charlson comorbidity index and healthcare utilization rates, and chi-square tests for categorical data such as sex, race, poor adherence and severe hypoglycemia. Literacy profiles varied in their test characteristics, with C-statistics ranging from 0.61–0.74. Relations between literacy profiles and health outcomes revealed patterns consistent with previous health literacy research: patients identified via literacy profiles indicative of limited health literacy: (a) were older and more likely of minority status; (b) had poorer medication adherence and glycemic control; and (c) exhibited higher rates of hypoglycemia, comorbidities and healthcare utilization. This represents the first successful attempt to employ natural language processing to estimate health literacy. 
Literacy profiles can offer an automated and economical way to identify patients with limited health literacy and greater vulnerability to poor health outcomes.
