Human Activity Recognition (HAR) requires very high accuracy to be employed effectively in practical applications, ranging from elderly care to microsurgical devices. The highest accuracies are achieved by Deep Learning models, but these are not easily deployed on handheld or wearable devices with very constrained resources. We therefore present a new HAR system suitable for a compact FPGA implementation. A new Binarized Neural Network (BNN) architecture performs the classification using data from a single tri-axial accelerometer. Our experiments show that gravity and the unknown orientation of the sensor degrade the accuracy. To compensate for these effects, we propose a hardware-friendly algorithm that pre-processes the raw acceleration signal. The low-power, hardware-friendly BNN has been trained and validated on the PAMAP2 dataset, for which the pre-processing operations increase the accuracy from 51% to 99% in the best case. To keep power consumption low, we designed both a custom circuit for the pre-processing operations and a hardware accelerator for the BNN. The FPGA implementation dissipates 72 mW and occupies 6788 LUTs.
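To make the pre-processing idea concrete, the following is a minimal sketch of one common way to suppress gravity and sensor orientation in a tri-axial acceleration signal. It is not the algorithm proposed in this work, whose details are not given here. It assumes that gravity can be estimated with an exponential moving average whose coefficient is a power of two (so that the update needs only shifts and additions), and that the Euclidean norm of the remaining signal serves as an orientation-independent feature. The function names and the `shift` parameter are hypothetical.

```python
import math


def remove_gravity(samples, shift=4):
    """Subtract a slowly varying gravity estimate from each axis.

    samples: list of (x, y, z) integer accelerometer readings.
    shift:   hypothetical smoothing parameter; the low-pass filter uses
             alpha = 1 / 2**shift, so each update is a shift and an add.
    """
    gravity = list(samples[0])  # initialise the estimate with the first sample
    dynamic_samples = []
    for sample in samples:
        dynamic = []
        for axis in range(3):
            # Exponential moving average: g += (x - g) / 2**shift
            gravity[axis] += (sample[axis] - gravity[axis]) >> shift
            # What remains after removing the estimate is the motion component
            dynamic.append(sample[axis] - gravity[axis])
        dynamic_samples.append(tuple(dynamic))
    return dynamic_samples


def magnitude(sample):
    """Euclidean norm of one sample; unchanged by a rotation of the axes."""
    return math.sqrt(sum(v * v for v in sample))
```

With a constant input such as `(0, 0, 1000)`, the gravity estimate equals the input and the output is zero on every axis. A sudden change on one axis passes through almost unattenuated at first and then decays as the estimate catches up.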