“…And w is the speaker and channel factor, which follows a standard normal distribution N(0, I). However, for short-duration SV the available statistics are known to be insufficient for reliable i-vector learning, which degrades performance [15]. Furthermore, generative models obtained via unsupervised learning may be improved with discriminative models, e.g.…”