Robustness against temporal variations is important for emotion recognition from speech audio, since emotion is expressed through complex spectral patterns that can exhibit significant local dilation and compression along the time axis depending on speaker and context. To address this and potentially other tasks, we introduce the multi-time-scale (MTS) method, which adds flexibility with respect to temporal variations when analyzing time-frequency representations of audio data. MTS extends convolutional neural networks with convolution kernels that are scaled and re-sampled along the time axis, increasing temporal flexibility without increasing the number of trainable parameters compared to standard convolutional layers. We evaluate MTS and standard convolutional layers in different architectures for emotion recognition from speech audio, using four datasets of different sizes. The results show that MTS layers consistently improve the generalization of networks of varying capacity and depth compared to standard convolution, especially on smaller datasets.
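The abstract describes the mechanism but includes no implementation. As an illustration only, below is a minimal PyTorch sketch of the stated idea: a single trainable kernel is re-sampled along the time axis at several scales, and the per-scale responses are merged, so the layer gains temporal flexibility with the same parameter count as a standard convolution. The class name `MTSConv2d`, the scale set, and the element-wise maximum across scales are assumptions for illustration, not the authors' published code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MTSConv2d(nn.Module):
    """Sketch of a multi-time-scale convolution (assumed design): one
    trainable kernel, re-sampled along the time axis at several scales,
    so the parameter count matches a standard conv layer."""

    def __init__(self, in_ch, out_ch, kernel_size=(3, 3), scales=(0.5, 1.0, 2.0)):
        super().__init__()
        self.scales = scales
        # Single shared kernel: no extra parameters per scale.
        self.weight = nn.Parameter(0.01 * torch.randn(out_ch, in_ch, *kernel_size))
        self.bias = nn.Parameter(torch.zeros(out_ch))

    def forward(self, x):  # x: (batch, channels, freq, time)
        k_f, k_t = self.weight.shape[2:]
        outs = []
        for s in self.scales:
            # Re-sample the kernel along time only; keep the length odd
            # so "same" padding preserves output size across scales.
            new_t = max(1, int(round(k_t * s)))
            new_t += 1 - new_t % 2
            w = F.interpolate(self.weight, size=(k_f, new_t),
                              mode="bilinear", align_corners=True)
            outs.append(F.conv2d(x, w, self.bias,
                                 padding=(k_f // 2, new_t // 2)))
        # Combine scale responses; an element-wise maximum is one
        # plausible choice (an assumption, not confirmed by the abstract).
        return torch.stack(outs, 0).max(0).values

# Usage on a batch of hypothetical mel spectrograms:
layer = MTSConv2d(1, 16)
spec = torch.randn(8, 1, 64, 128)  # (batch, channel, freq, time)
out = layer(spec)                  # -> (8, 16, 64, 128)
```

Because the scaled kernels are produced by differentiable interpolation of the one shared kernel, gradients flow back to a single weight tensor, which is what keeps the parameter count equal to that of a standard convolutional layer.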
Visual-to-auditory sensory substitution devices (SSDs) provide improved access to the visual environment for the visually impaired by converting images into auditory information. Research is lacking on the mechanisms involved in processing data that is perceived through one sensory modality but directly associated with a source in a different sensory modality. This is important because SSDs that use auditory displays could involve binaural presentation, requiring both ear canals, or monaural presentation, requiring only one; but which ear would be ideal? SSDs may be similar to reading, as an image (a printed word) is converted into sound (when read aloud). Reading, and language more generally, are typically lateralised to the left cerebral hemisphere. Yet, unlike symbolic written language, SSDs convert images to sound based on visuospatial properties, with the right cerebral hemisphere potentially having a role in processing such visuospatial data. Here we investigated whether there is a hemispheric bias in the processing of visual-to-auditory sensory substitution information, and whether that bias varies as a function of experience and visual ability. We assessed the lateralisation of auditory processing with two tests: a standard dichotic listening test and a novel dichotic listening test created using the auditory information produced by an SSD, The vOICe. Participants were tested either in the lab or online with the same stimuli. We did not find a hemispheric bias in the processing of visual-to-auditory information in visually impaired, experienced vOICe users. Further, we did not find any difference between visually impaired, experienced vOICe users and sighted novices in the hemispheric lateralisation of visual-to-auditory information processing. Although standard dichotic listening is lateralised to the left hemisphere, the auditory processing of images in SSDs is bilateral, possibly due to the increased influence of right-hemisphere processing. Auditory SSDs might therefore be equally effective with presentation to either ear if monaural, rather than binaural, presentation were necessary.