We study a prediction+optimisation formulation of the knapsack problem. The goal is to predict the profits of knapsack items based on historical data, and afterwards use these predictions to solve the knapsack. The key is that the item profits are not known beforehand and thus must be estimated, but the quality of the solution is evaluated with respect to the true profits. We formalise the problem, the goal of minimising expected regret and the learning problem, and investigate different machine learning approaches that are suitable for the optimisation problem. Recent methods for linear programs have incorporated the linear relaxation directly into the loss function. In contrast, we consider less intrusive techniques of changing the loss function, such as standard and multi-output regression, and learning-to-rank methods. We empirically compare the approaches on real-life energy price data and synthetic benchmarks, and investigate the merits of the different approaches.

Combinatorial optimisation is crucial in today's society and is used throughout many industries. In this paper, we work with the fundamental knapsack problem, which has been studied for over a century and is well understood [17,9]. It is studied in fields such as combinatorics, computer science, complexity theory, cryptography, and applied mathematics. It has numerous applications, including resource allocation problems where the aim is to select as many resources as possible under given financial constraints. The knapsack problem is NP-hard, though highly efficient solution methods exist for reasonably sized instances [9].

In traditional optimisation, it is assumed that all parameters, e.g. the profits and weights in a knapsack, are precisely known beforehand. In practice, these are often crude estimates based on domain expertise or historical data. As we enter the age of big data, large amounts of data are available and thus parameters can be estimated with greater precision.
For example, ongoing promotions and the current weather might influence demand. The question that arises is whether such contextual data, together with historical data, can be used to improve decision making, i.e. to solve the underlying optimisation problem more effectively.

Such problems are encountered in load shifting [10], where the aim is to create an energy-aware day-ahead schedule based on predicted hourly energy prices.
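
To make the evaluation criterion concrete, the following is a minimal sketch of how regret could be computed for a single knapsack instance. It is not the implementation used in this paper. It assumes integer weights, a textbook dynamic-programming solver, and the usual notion of regret as the true value of the optimal selection minus the true value of the selection obtained from the predicted profits. The function names `solve_knapsack` and `regret` and the toy numbers are illustrative assumptions.

```python
from typing import List, Sequence


def solve_knapsack(profits: Sequence[float], weights: Sequence[int],
                   capacity: int) -> List[int]:
    """Return the indices of an optimal 0/1 knapsack selection (DP sketch)."""
    n = len(profits)
    # best[i][c] = maximum profit using the first i items with capacity c
    best = [[0.0] * (capacity + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        w, p = weights[i - 1], profits[i - 1]
        for c in range(capacity + 1):
            best[i][c] = best[i - 1][c]  # skip item i-1
            if w <= c and best[i - 1][c - w] + p > best[i][c]:
                best[i][c] = best[i - 1][c - w] + p  # take item i-1
    # Backtrack to recover which items were taken.
    chosen, c = [], capacity
    for i in range(n, 0, -1):
        if best[i][c] != best[i - 1][c]:
            chosen.append(i - 1)
            c -= weights[i - 1]
    return sorted(chosen)


def regret(true_profits: Sequence[float], predicted_profits: Sequence[float],
           weights: Sequence[int], capacity: int) -> float:
    """True value lost by optimising over predicted instead of true profits."""
    chosen = solve_knapsack(predicted_profits, weights, capacity)
    optimal = solve_knapsack(true_profits, weights, capacity)

    def true_value(items: List[int]) -> float:
        return sum(true_profits[i] for i in items)

    return true_value(optimal) - true_value(chosen)


# Toy instance (hypothetical numbers): three items, capacity 4.
weights = [3, 2, 2]
true_profits = [10, 7, 4]

# Overestimating item 0 makes the solver pick it alone: regret is 1.
print(regret(true_profits, [12, 6, 5], weights, 4))

# These predictions are also inaccurate, yet lead to the optimal selection.
print(regret(true_profits, [9, 8, 3], weights, 4))
```

In this toy instance the second prediction vector has errors on every item but still yields zero regret, which illustrates why prediction accuracy and solution quality need not coincide.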