The online bin packing problem is a well-known bin packing variant which requires immediate decisions to be made for the placement of a lengthy sequence of arriving items of various sizes one at a time into fixed capacity bins without any overflow. The overall goal is maximising the average bin fullness. We investigate a 'policy matrix' representation which assigns a score for each decision option independently and the option with the highest value is chosen for one dimensional online bin packing. A policy matrix might also be considered as a heuristic with many parameters, where each parameter value is a score.We hence investigate a framework which can be used for creating heuristics via many parameters. The proposed framework combines a Genetic Algorithm optimiser, which searches the space of heuristics in policy matrix form, and an online bin packing simulator, which acts as the evaluation function. The empirical results indicate the success of the proposed approach, providing the best solutions for almost all item sequence generators used during the experiments.We also present a novel fitness landscape analysis on the search space of policies. This study hence gives evidence of the potential for automated discovery by intelligent systems of powerful heuristics for online problems; reducing the need for expensive use of human expertise.