Many modern applications, both scientific and commercial, are deployed to cloud environments and often employ multiple types of resources. This allows them to allocate only the resources that are actually needed to achieve their goals. However, in many workloads the actual usage of the infrastructure varies over time, which results in over-provisioning and unnecessarily high costs. In such cases, automatic resource scaling can provide significant cost savings by provisioning only the resources necessary to support the current workload. Unfortunately, due to the complex nature of distributed systems, automatic scaling remains a challenge. Reinforcement learning has recently been a very active field of research. Thanks to its combination with deep learning, many newly designed algorithms have improved the state of the art in complex domains. In this paper, we present the results of our attempt to use recent advances in reinforcement learning to optimize the cost of running a compute-intensive evolutionary process by automating the scaling of heterogeneous resources in a compute cloud environment. We describe the architecture of our system and present evaluation results. The experiments include autonomous management of a sample workload and a comparison of its performance with the traditional threshold-based automatic management approach. We also provide details of training the management policy using the proximal policy optimization (PPO) algorithm. Finally, we discuss the feasibility of extending the presented approach to further scenarios.