Abstract-In this paper, the problem of age estimation is addressed using two modalities: speech utterances and speakers' face images. The proposed age estimation framework employs the Shifted Covariates REgression Analysis for Multiway data (SCREAM) model, which combines Parallel Factor Analysis 2 and Principal Covariates Regression. SCREAM extracts a few latent variables from multi-way data and computes the regression coefficients. Initially, biologically inspired features are extracted from speech utterances and face images, and suitable feature matrices are created to be fed to the multi-way SCREAM model. For bimodal age estimation, the visual and aural features are appropriately combined in a single matrix for each person. Experimental results demonstrate the benefit of combining the two modalities. The performance attained by the multi-way regression for age estimation is also measured on the benchmark face image dataset FG-NET. The proposed method is found to be competitive with state-of-the-art age estimation methods.