The human voice is an extremely important biological signal that carries information about a speaker's sex, age, emotional state, health and physical features. Estimating physical appearance from vocal cues could be a valuable asset for fields such as forensics and dietetics. Although several studies have examined the relationships between vocal parameters and ratings of a speaker's height, weight, age and musculature, to our knowledge no study has examined whether a person's BMI can be assessed from voice alone.
The purpose of the current study was to determine whether female “Judges” can evaluate the obesity and body fat distribution of speakers (men and women) from their vocal cues. We also examined which voice parameters serve as the key vocal cues in this assessment.
The study material consisted of voice recordings of 12 adult speakers (6 women), assessed by 87 “Judges” on a 5-point graphic scale depicting body fat level and distribution (separate scales for men and women). For each speaker, body height, weight, BMI, Visceral Fat Level (VFL, InBody 270) and acoustic parameters were measured. In addition, the accuracy of BMI category was verified. To determine which vocal parameters served as cues in the assessment of men and women, two independent experiments were conducted: in experiment I, “Judges” had to choose the one obese speaker from 3 voices (in 4 series); in experiment II, they rated the body fat level of the same 12 speakers on the 5-point graphic scale.
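As a minimal sketch of the BMI measure used to characterize the speakers, the snippet below computes BMI from weight and height with the standard formula (weight in kilograms divided by height in metres squared) and applies the "BMI above 30" obesity criterion used in this study. The function names `bmi` and `is_obese` are hypothetical and chosen only for illustration; the InBody 270 measurements themselves are not reproduced here.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Return body mass index in kg/m^2 (standard formula)."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return weight_kg / height_m ** 2


def is_obese(bmi_value: float, threshold: float = 30.0) -> bool:
    """Apply the 'BMI above 30' criterion; the threshold is a parameter."""
    return bmi_value > threshold
```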
Obese speakers (i.e., BMI above 30) were selected correctly with an accuracy greater than expected by chance (experiment I). On the graphic scale, speakers with higher BMI were rated as fatter (experiment II). For male speakers, the most important vocal predictors of BMI were harmonics-to-noise ratio (HNR) and formant dispersion (Df); for female speakers, they were formant spacing (Pf) and intensity (loudness).
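To illustrate what "greater than expected by chance" means in experiment I: with one obese speaker among 3 voices, a guessing "Judge" is correct with probability 1/3 in each series. The sketch below computes an exact one-sided binomial tail probability against that chance level. It is an illustration under stated assumptions, not the statistical analysis actually used in the study; the function name `binomial_tail` is hypothetical, and treating series as independent trials is a simplifying assumption.

```python
from math import comb

# Chance level in experiment I: one obese speaker among 3 voices.
CHANCE = 1 / 3


def binomial_tail(successes: int, trials: int, p_chance: float) -> float:
    """Return P(X >= successes) for X ~ Binomial(trials, p_chance)."""
    return sum(
        comb(trials, k) * p_chance ** k * (1 - p_chance) ** (trials - k)
        for k in range(successes, trials + 1)
    )
```

For example, a single guessing "Judge" would pick the obese speaker in at least 3 of the 4 series with probability `binomial_tail(3, 4, CHANCE)`, which equals 1/9.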
The human voice contains information about a speaker's increased BMI level, which is carried by some vocal cues.