Speaker recognition is a multi-disciplinary branch of biometrics that may be used for the identification, verification, and classification of individual speakers, and by extension for tracking, detection, and segmentation. A comprehensive book on all aspects of speaker recognition was published recently [1]. We are therefore not concerned here with the details of the standard modeling that is, and has been, used for the recognition task. Instead, we review the most recent literature and briefly visit the latest techniques being deployed in the various branches of this technology. Most of the works reviewed here were published in the last two years. Some of the topics, such as alternative features and modeling techniques, are general and apply to all branches of speaker recognition. Some of these general topics, such as whispered speech, concern the advanced treatment of special forms of audio that have not received ample attention in the past. Finally, we look at advancements that apply to specific branches of speaker recognition [1], such as verification, identification, classification, and diarization. This chapter is meant to complement the summary of speaker recognition presented in [2], which provided an overview of the subject. It is also intended as an update on the methods described in [1]. For the sake of completeness, the next section presents a brief history of speaker recognition. It is followed by sections on the progress outlined above: first globally applicable treatments and methods, then techniques related to specific branches of speaker recognition.
A brief history

The topic of speaker recognition [1] has been under development since the mid-twentieth century. The earliest known papers on the subject, published in the 1950s [3,4], sought to identify personal traits of speakers by analyzing their speech, with some statistical underpinning. With the advent of early communication networks, Pollack et al. [3] noted the need for speaker identification. They employed human listeners to identify individuals, however, and studied the importance of the duration of speech and other facets that help in the recognition of a speaker. In most of the early