Detecting Process Duration Drift Using Gamma Mixture Models in a Left-Truncated and Right-Censored Environment

Yang, Lingkai; McClean, Sally; Donnelly, Mark; Khan, Kashaf; Burke, Kevin

doi:10.1145/3669942

Cited by 4 publications

References 50 publications

Using Semi-Markov Models for Generating, Validating, and Analyzing Artificial Smart Home Processes

McClean, Wang, Yang et al. 2024

Lecture Notes in Networks and Systems

Gamma-mixture Bayesian method for anomalous coalmine pressure analysis

Yang, Cheng, Luo et al. 2024

Memetic Comp.

Larger and more instructable language models become less reliable

Zhou, Schellaert, Martínez-Plumed et al. 2024

Nature

The prevailing methods to make large language models more powerful and amenable have been based on continuous scaling up (that is, increasing their size, data volume and computational resources [1]) and bespoke shaping up (including post-filtering [2,3], fine tuning or use of human feedback [4,5]). However, larger and more instructable large language models may have become less reliable. By studying the relationship between difficulty concordance, task avoidance and prompting stability of several language model families, here we show that easy instances for human participants are also easy for the models, but scaled-up, shaped-up models do not secure areas of low difficulty in which either the model does not err or human supervision can spot the errors. We also find that early models often avoid user questions but scaled-up, shaped-up models tend to give an apparently sensible yet wrong answer much more often, including errors on difficult questions that human supervisors frequently overlook. Moreover, we observe that stability to different natural phrasings of the same question is improved by scaling-up and shaping-up interventions, but pockets of variability persist across difficulty levels. These findings highlight the need for a fundamental shift in the design and development of general-purpose artificial intelligence, particularly in high-stakes areas for which a predictable distribution of errors is paramount.
