In commemoration of the foundation of the Japanese Journal of Statistics and Data Science (JJSD), our third special feature focuses on relationships and collaborations between information theory and statistics. Development and expansion of the research areas of information theory and statistics are drawing a great deal of attention. This special feature comprises ten contributions drawn mainly from three research topics: divergence-based statistical inference, high-dimensional sparse learning, and combinatorial design.

A divergence measure, an extension of distance over the set of probability distributions, is a transverse tool in the mathematical sciences. In particular, the Kullback-Leibler divergence, in other words the relative entropy, is closely related to the maximum likelihood estimator in statistics and to code length in information theory (Kullback 1959; Cover and Thomas 2006). The information criterion (Akaike 1974) and Bayes coding (Clarke and Barron 2006) are interdisciplinary research topics based on the concept of divergences. Today, important classes of divergence measures, such as the Bregman divergences, are widely applied in data analysis and the information sciences (Bregman 1967; Basu et al. 1998; Fujisawa and Eguchi 2008). The following five articles focus mainly on theoretical analysis and practical applications of divergence measures.

- Machida and Takenouchi (2019) are concerned with non-negative matrix factorization (NMF), a typical means of feature extraction in the framework of unsupervised learning (Lee and Seung 2001). It is well known that the standard NMF algorithm is not robust against outlier noise. The authors propose robust NMF algorithms by combining statistical modeling of the reconstruction with the -divergence; a sketch of the standard algorithm they build on follows below.
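To make the notion of the "standard NMF algorithm" concrete, the following is a minimal sketch of the multiplicative-update NMF of Lee and Seung (2001) for the squared Euclidean reconstruction error. It is not the robust method of Machida and Takenouchi (2019); the function name `nmf_multiplicative` and all variable names are illustrative, and the robust variants discussed in the article replace this reconstruction criterion with a divergence-based one.

```python
import numpy as np

def nmf_multiplicative(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Standard NMF via the multiplicative updates of Lee and Seung (2001),
    minimizing the squared Euclidean error ||V - W H||_F^2.
    V is a non-negative (n_features x n_samples) matrix."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, rank))   # non-negative basis matrix
    H = rng.random((rank, m))   # non-negative coefficient matrix
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H)
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        # W <- W * (V H^T) / (W H H^T)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Toy usage: factorize a small non-negative matrix.
V = np.random.default_rng(1).random((20, 30))
W, H = nmf_multiplicative(V, rank=5)
print(np.linalg.norm(V - W @ H))  # reconstruction error after the updates
```

Because the updates minimize a squared-error criterion, a few large outlier entries in V can dominate the fit, which is the lack of robustness that motivates the divergence-based modifications studied in the article.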