“…We then rescaled the word counts to get the log2 frequency of occurrences per 1 billion words, so higher values indicate higher log frequencies. We got per-word surprisals for each of 4 different language models, covering a range of common architectures: a Kneser-Ney smoothed 5-gram; the long short-term memory recurrent neural network model of Gulordava et al. (2018), which we refer to as GRNN; Transformer-XL (Dai et al., 2019); and GPT-2 (Radford et al., n.d.), using lm-zoo (Gauthier et al., 2020).²

² We, furthermore, used the R packages bookdown (Version 0.29; Xie, 2016), brms (Version 2.18.0; Bürkner, 2017, 2018, 2021), broom.mixed (Version 0.2.9.4; Bolker & Robinson, 2022), cowplot (Version 1.1.1; Wilke, 2020), gridExtra (Version 2.3; Auguie, 2017), here (Version 1.0.1; Müller, 2020), kableExtra (Version 1.3.4; Zhu, 2021), lme4 (Version 1.1.31; Bates et al., 2015), mgcv (Version 1.8.41; Wood, 2003, 2004, 2011; Wood et al., 2016), mgcViz (Version 0.1.9; Fasiolo et al., 2018), papaja (Version 0.1.1; Aust & Barth, 2022), patchwork (Version 1.1.2; Pedersen, 2022), rticles (Version 0.24.4; Allaire et al., 2022), tidybayes (Version 3.0.2; Kay, 2022), tidymv (Version 3.3.2; Coretta, 2022), and tidyverse (Version 1.3.2; Wickham et al., 2019).…”
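As a rough illustration of the frequency rescaling described in the passage, the R sketch below converts raw corpus counts into log2 occurrences per 1 billion words. It is only a minimal sketch: the data frame word_counts, its columns word and count, the made-up counts, and the corpus_size variable are hypothetical placeholders, not taken from the original materials.

    # Minimal sketch (hypothetical data): rescale raw word counts to
    # log2 frequency of occurrences per 1 billion words.
    library(dplyr)

    word_counts <- tibble::tibble(
      word  = c("the", "model", "surprisal"),
      count = c(5e7, 1e5, 2e3)            # raw counts in some corpus (made up)
    )

    corpus_size <- sum(word_counts$count)  # total tokens in the corpus

    word_counts <- word_counts %>%
      mutate(
        per_billion = count / corpus_size * 1e9,  # occurrences per 1 billion words
        log2_freq   = log2(per_billion)           # higher values = more frequent words
      )

    word_counts

Higher log2_freq values indicate more frequent words, matching the interpretation given in the quoted text.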