“…Depending on the need to exploit one or the other of these distinguished properties, the Bregman distances or Csiszár divergences are preferred, and both of them are widely applied in important areas of information theory, statistics and computer science, for example in (Ai) information retrieval (see, e.g., Do and Vetterli (2002), Hertz et al. (2004)), (Aii) optimal decision (for general decision see, e.g., Boratynska (1997), Freund et al. (1997), Bartlett et al. (2006), Vajda and Zvárová (2007); for speech processing see, e.g., Carlson and Clements (1991), Veldhuis and Klabbers (2002); for image processing see, e.g., Xu and Osher (2007), Marquina and Osher (2008), Scherzer et al. (2008)), and (Aiii) machine learning (see, e.g., Lafferty (1999), Banerjee et al. (2005), Amari (2007), Teboulle (2007), Nock and Nielsen (2009)).…”