Use policyThe full-text may be used and/or reproduced, and given to third parties in any format or medium, without prior permission or charge, for personal research or study, educational, or not-for-prot purposes provided that:• a full bibliographic reference is made to the original source • a link is made to the metadata record in DRO • the full-text is not changed in any way The full-text must not be sold in any format or medium without the formal permission of the copyright holders.Please consult the full DRO policy for further details.
ABSTRACTWhen performing a trace-driven simulation of a High Throughput Computing system we are limited to the knowledge which should be available to the system at the current point within the simulation. However, the trace-log contains information we would not be privy to during the simulation. Through the use of Machine Learning we can extract the latent patterns within the trace-log allowing us to accurately predict characteristics of tasks based only on the information we would know. These characteristics will allow us to make better decisions within simulations allowing us to derive better policies for saving energy. We demonstrate that we can accurately predict (up-to 99% accuracy), using oversampling and deep learning, those tasks which will complete while at the same time provide accurate predictions for the task execution time and memory footprint using Random Forest Regression.