A significant amount of research effort goes into machine learning (ML) and deep learning (DL) technologies. Real-world ML applications help companies improve products and automate tasks such as classification and image recognition. However, the traditional "fixed" approach, in which the system is frozen before deployment, leads to sub-optimal system performance. Systems that autonomously experiment with and improve their own behavior and performance could improve business outcomes, but we need to know how this could actually work in practice. While there is some research on autonomously improving systems, its focus is on concepts and theoretical algorithms; less research addresses empirical industry validation of the proposed theory. Empirical validations are usually done through simulations or by using synthetic or manually altered datasets. The contribution of this paper is twofold. First, we conduct a systematic literature review focused on papers describing industrial deployments of autonomously improving systems and their real-world applications. Second, we identify open research questions and derive a model that classifies the level of autonomy based on our findings in the literature review.