High Performance Computing (HPC) clouds are becoming an alternative to on-premise clusters for executing scientific applications and business analytics services. Most research efforts in HPC cloud aim to understand the cost-benefit of moving resource-intensive applications from on-premise environments to public cloud platforms. Industry trends show hybrid environments are the natural path to get the best of on-premise and cloud resources: steady (and sensitive) workloads can run on on-premise resources, while peak demand can leverage remote resources in a pay-as-you-go manner. Nevertheless, there are plenty of questions to be answered in HPC cloud, ranging from how to extract the best performance from an unknown underlying platform to what services are essential to make its usage easier. Moreover, the discussion on the right pricing and contractual models to fit small and large users is relevant for the sustainability of HPC clouds. This paper brings a survey and taxonomy of efforts in HPC cloud and a vision of what we believe lies ahead, including a set of research challenges that, once tackled, can help advance businesses and scientific discoveries. This becomes particularly relevant due to the fast-increasing wave of new HPC applications coming from big data and artificial intelligence.

On-premise cluster users typically have no visibility of, or concerns about, the costs of running their jobs. However, large clusters do incur expenses and, when not properly managed, can generate resource wastage and poor quality of service. Motivated by the different utilization levels of clusters around the globe and by the need to run ever larger parallel programs, in the early 2000s Grid Computing became relevant for the HPC community. Grids offer users access to powerful resources managed by autonomous administrative domains [50,51]. The notion of monetary costs for running applications was soft, favoring a more collaborative model of resource sharing.
Therefore, quality of service was not strict in Grids, with users relying on best-effort policies to run their applications.

In the late 2000s, cloud computing [8,26,91] was quickly increasing its maturity level and popularity, and studies started to emerge on the viability of executing HPC applications on remote cloud resources. These applications, which consume more resources than traditional cloud applications and are usually executed in batches rather than as 24x7 services, range from parallel applications written in Message Passing Interface (MPI) [58,59] to the newest big data [11,14,39,101] and artificial intelligence applications, the latter mostly relying on deep learning [34,80]. Cloud thus came up as an evolution of a series of technologies, mainly virtualization and computer networks, which facilitated workload management and interaction with remote resources, respectively. Apart from software and hardware, cloud offers a business model where users pay for resources on demand. Compared to traditional HPC environments, in clouds users can quickly adjust their resource pools via a mechanism known as elasticity.
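The elasticity mechanism just mentioned can be illustrated with a minimal sketch: a threshold policy that resizes a pool of cloud instances to match queued demand. The function name, capacity figure, and pool limits below are hypothetical, not tied to any provider's actual API.

```python
def elastic_resize(queued_jobs, jobs_per_node=4, min_nodes=1, max_nodes=32):
    """Return the pool size needed to absorb the queued jobs.

    Illustrative toy policy: each node is assumed to serve
    `jobs_per_node` jobs concurrently, and the pool is clamped
    to (hypothetical) contractual limits min_nodes..max_nodes.
    """
    needed = -(-queued_jobs // jobs_per_node)  # ceiling division
    return max(min_nodes, min(max_nodes, needed))

if __name__ == "__main__":
    # Peak demand: 30 queued jobs -> scale out.
    print(elastic_resize(30))  # 8
    # Demand drops: 3 queued jobs -> scale in.
    print(elastic_resize(3))   # 1
```

In a real deployment, the decision above would be fed by a monitoring loop and acted on through the provider's resource API; the pay-as-you-go aspect comes from only paying for the nodes the policy keeps allocated.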