Data produced by wearable sensors is key in contexts such as performance enhancement and training help for sports and fitness, continuous monitoring for aging people and for chronic disease management, and in gaming and entertainment. Unfortunately, wearable devices currently on the market are either incapable of complex functionality or severely impaired by short battery lifetime. In this work, we present a smartwatch platform based on an ultra-low power (ULP) heterogeneous system composed by a TI MSP430 microcontroller, the PULP programmable parallel accelerator and a set of ULP sensors, including a camera. The embedded PULP accelerator enables state-of-the-art context classification based on Convolutional Neural Networks (CNNs) to be applied within a sub-10mW system power envelope. Our methodology enables to reach high accuracy in context classification over 5 classes (up to 84%, with 3 classes over 5 reaching more than 90% accuracy), while consuming 2.2mJ per classification, or an ultra-low energy consumption of less than 91uJ per classification with an accuracy of 64%-3.2× better than chance. Our results suggest that the proposed heterogeneous platform can provide up to 500× speedup with respect to the MSP430 within a similar power envelope, which would enable complex computer vision algorithms to be executed in highly power-constrained scenarios.