Cloud computing and deep learning, two recent major trends in the software industry, have enabled small companies to scale their business up rapidly. However, this growth is not without a cost: deep learning models are among the heaviest workloads in cloud data centers. As the business grows, the monetary cost of deep learning in the cloud grows rapidly as well. Deep learning practitioners should be prepared and equipped to limit this growing cost. We performed a systematic literature review on methods to control the monetary cost of deep learning. Our library search resulted in 16,066 papers from three article databases: IEEE Xplore, ACM Digital Library, and Scopus. We narrowed them down to 112 papers, which we categorized and summarized. We found that: 1) Optimizing inference has raised more interest than optimizing training. Popular deep learning libraries already support some of the inference optimization methods, such as quantization, pruning, and teacher-student distillation. 2) The research has been centered around image inputs, and there seems to be a research gap for other types of inputs. 3) The research has been hardware-oriented, and the most typical approach to controlling the cost of deep learning is based on algorithm-hardware co-design. 4) Offloading some of the processing to client devices is gaining interest and has the potential to reduce the monetary cost of deep learning.
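Of the inference optimization methods mentioned above, quantization is perhaps the simplest to illustrate. The sketch below is a generic, library-free example of post-training affine quantization (mapping float weights to 8-bit integers); it is not taken from any paper in the review, and the function names are illustrative only.

```python
def quantize(weights, num_bits=8):
    """Affine (asymmetric) quantization of a list of floats to unsigned ints.

    Returns the quantized values plus the (scale, zero_point) pair
    needed to map them back to the original range.
    """
    qmin, qmax = 0, 2 ** num_bits - 1
    w_min, w_max = min(weights), max(weights)
    # Guard against a degenerate range where all weights are identical.
    scale = (w_max - w_min) / (qmax - qmin) or 1.0
    zero_point = round(qmin - w_min / scale)
    q = [max(qmin, min(qmax, round(w / scale) + zero_point)) for w in weights]
    return q, scale, zero_point


def dequantize(q, scale, zero_point):
    """Recover approximate float weights from their quantized form."""
    return [(qi - zero_point) * scale for qi in q]
```

Storing 8-bit integers instead of 32-bit floats cuts the memory footprint of the weights by roughly a factor of four, at the price of a bounded rounding error (at most one quantization step per weight); production frameworks apply the same idea with hardware-accelerated integer kernels.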