In human-computer interaction, particularly in multimedia delivery, information is communicated to users sequentially, whereas users are capable of receiving information from multiple sources concurrently. This mismatch indicates that a sequential mode of communication does not utilise human perception capabilities as efficiently as possible. This article reports an experiment that investigated several speech-based (audio) concurrent designs and evaluated depth of comprehension by comparing performance across different question formats (main/detailed, implied/stated). The results showed that users, besides answering the main questions, were also successful in answering the implied questions and the questions that required detailed information, and that the pattern of comprehension depth remained similar to that seen in a baseline condition, where only one speech source was presented. However, participants answered more questions correctly when the questions were drawn from the main information, and performance remained low for questions drawn from detailed information. The results encourage further exploration of concurrent methods for communicating multiple information streams efficiently in human-computer interaction, including multimedia.