Quantifying behavioural changes in depression using affective computing techniques is the first step in developing an objective, clinically useful diagnostic aid for clinical depression. As part of the AVEC 2013 Challenge, we present a multimodal approach for the Depression Sub-Challenge using a GMM-UBM system with three different kernels for the audio subsystem and Space Time Interest Points in a Bag-of-Words approach for the vision subsystem. These are then fused at the feature level to form the combined AV system. Key results include the strong performance of acoustic audio features and the bag-of-words visual features in predicting an individual's level of depression using regression. Notably, given the small amount of literature on the subject, our feature-level multimodal fusion technique is able to outperform both the audio and visual challenge baselines.
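As an illustration of what fusing "at the feature level" generally means, the sketch below standardises each modality's feature dimensions and then concatenates the audio and visual vectors of each recording into one vector, which a single regressor could then be trained on. This is a minimal sketch under stated assumptions, not the authors' pipeline: the function names, the z-score normalisation step, and the toy numbers are all hypothetical and do not come from the abstract.

```python
from statistics import mean, pstdev


def zscore_columns(rows):
    """Standardise each feature dimension (column) to zero mean, unit variance."""
    columns = list(zip(*rows))
    mus = [mean(c) for c in columns]
    # A constant column has zero standard deviation; fall back to 1.0
    # so the division below is safe and the column maps to all zeros.
    sigmas = [pstdev(c) or 1.0 for c in columns]
    return [[(x - m) / s for x, m, s in zip(row, mus, sigmas)] for row in rows]


def fuse_features(audio_rows, video_rows):
    """Feature-level fusion: concatenate per-recording audio and video vectors.

    Normalising first is an assumption of this sketch; it keeps one
    modality from dominating the other purely because of its scale.
    """
    if len(audio_rows) != len(video_rows):
        raise ValueError("both modalities need one vector per recording")
    audio_z = zscore_columns(audio_rows)
    video_z = zscore_columns(video_rows)
    return [a + v for a, v in zip(audio_z, video_z)]


# Hypothetical toy data: 3 recordings, 2 audio and 3 visual dimensions each.
audio = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
video = [[0.0, 5.0, 1.0], [1.0, 5.0, 0.0], [2.0, 5.0, 1.0]]
fused = fuse_features(audio, video)  # 3 fused vectors of dimension 2 + 3
```

The alternative design, decision-level fusion, would train one regressor per modality and combine their predictions; the abstract states that the feature-level variant was the one used.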
Robust screening of materials on the basis of structure–property–activity relationships to discover active photocatalysts is a highly sought-after aspect of photocatalysis research. Recent advancements in machine learning offer considerable opportunities to evolve photocatalyst discovery practices. Machine learning has largely facilitated various areas of science and engineering, including heterogeneous catalysis, but its adoption in photocatalysis research is still at an elementary stage. The scarcity of consistent training data is a major bottleneck, and we foresee the integration of photocatalysis domain knowledge in mainstream machine learning protocols as a viable solution. Here, we present a holistic framework incorporating machine learning and domain knowledge to set directions toward accelerated discovery of solar photocatalysts. This Perspective begins with a discussion of the domain knowledge available in photocatalysis that could be leveraged alongside machine learning methods. Subsequently, we present prevalent machine learning practices in heterogeneous catalysis tailored to assist discovery of photocatalysts in a purely data-driven fashion. Lastly, we conceptualize various strategies for complementing data-driven machine learning with photocatalysis domain knowledge. The strategies involve the following: (i) integration of theoretical and prior empirical knowledge during the training of machine learning models; (ii) embedding the knowledge in feature space; and (iii) utilizing existing material databases to constrain machine learning predictions. This human-in-the-loop framework (leveraging both human and machine intelligence) could mitigate the lack of interpretability and reliability associated with data-driven machine learning and reinforce complex model architectures irrespective of data scarcity. The concept could also offer substantial benefits to photocatalysis informatics by promoting a paradigm shift away from the Edisonian approach.
The spectral and energy properties of speech have consistently been observed to change with a speaker's level of clinical depression. This has resulted in spectral and energy based features being a key component in many speech-based classification and prediction systems. However, there has been no in-depth investigation into understanding how acoustic models of spectral features are affected by depression. This paper investigates the hypothesis that the effects of depression in speech manifest as a reduction in the spread of phonetic events in acoustic space as modelled by Gaussian Mixture Models (GMM) in combination with Mel Frequency Cepstral Coefficients (MFCC). Our investigation uses three measures of acoustic variability: Average Weighted Variance (AWV), Acoustic Movement (AM) and Acoustic Volume, which attempt to model depression-specific acoustic variations (AWV and Acoustic Volume), or the trajectory of speech in the acoustic space (AM). Within our analysis we present the Probabilistic Acoustic Volume (PAV), a novel method for robustly estimating Acoustic Volume using a Monte Carlo sampling of the feature distribution being modelled. We show that using an array of PAV points we gain insights into how the concentration of the feature vectors in the feature space changes with depression. Key results, found on two commonly used depression corpora, consistently indicate that as a speaker's level of depression increases there are statistically significant reductions in both AWV (-0.44 ≤ r_s ≤ -0.18 with p < .05) and AM (-0.26 ≤ r_s ≤ -0.19 with p < .05) values, indicating a decrease in localised acoustic variance and smoothing in acoustic trajectory respectively. Further, there are also statistically significant reductions (-0.32 ≤ r_s ≤ -0.20 with p < .05) in Acoustic Volume measures and strong statistical evidence (-0.48 ≤ r_s ≤ -0.23 with p < .05) that the MFCC feature space becomes more concentrated. Quantifying these effects is expected to be a key step towards building an objective classification or prediction system which is robust to many of the unwanted (in terms of depression analysis) sources of variability modulated into a speech signal.
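The correlation ranges reported above are presumably Spearman rank correlations, the usual meaning of the r_s notation; the abstract does not spell this out, so treat that reading as an assumption. The sketch below shows how such a coefficient could be computed between depression scores and a per-speaker variability measure. The data are invented purely for illustration and are not results from the paper, and no p-value is computed.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j over the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; raises ZeroDivisionError if an input is constant."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's r_s: the Pearson correlation of the two rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical toy data: depression scores and a variability measure
# for six speakers. These numbers are made up for the example.
depression_scores = [3, 10, 18, 25, 31, 40]
variability = [0.90, 0.80, 0.85, 0.60, 0.50, 0.40]
r_s = spearman(depression_scores, variability)  # negative: variability falls as score rises
```

A negative r_s, as in the reported ranges, means the measure tends to decrease as the depression score increases, without assuming the relationship is linear.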