Eating difficulties, and the resulting need for eating assistance, are prevalent within the elderly population. Moreover, a poor diet is considered a contributing factor in the development of chronic diseases and functional limitations. Motivated by these issues, this paper proposes a food and drink intake recognition system based on a wrist-worn tri-axial accelerometer. First, an adaptive segmentation technique is employed to identify potential eating and drinking gestures in the continuous accelerometer readings. Subsequently, the use of Convolutional Neural Networks for recognising eating and drinking gestures is studied. This study covers three time-series-to-image encoding frameworks, namely the signal spectrogram, the Markov Transition Field and the Gramian Angular Field, as well as the development of several multi-input multi-domain networks. Gesture recognition is then treated as a 3-class classification problem (‘Eat’, ‘Drink’ and ‘Null’), where the ‘Null’ class comprises all the irrelevant gestures present in the post-segmentation gesture set. The proposed system achieves an average per-class classification accuracy of 97.10%. Compared with similar work, this classification performance represents a notable contribution to the field of assisted living.
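
To make the time-series-to-image encodings mentioned above more concrete, the following is a minimal NumPy sketch of two of them, the Gramian Angular Field (summation variant) and the Markov Transition Field, applied to a single accelerometer axis. It follows the commonly used definitions of these encodings and is not the implementation used in this work; the function names, the min-max rescaling to [-1, 1], the quantile binning and the default `n_bins=4` are all assumptions made for illustration.

```python
import numpy as np


def gramian_angular_field(x):
    """Summation Gramian Angular Field of a 1-D series (illustrative sketch)."""
    x = np.asarray(x, dtype=float)
    lo, hi = x.min(), x.max()
    if hi == lo:
        # Constant series: map every sample to 0 (assumed convention).
        scaled = np.zeros_like(x)
    else:
        # Min-max rescale to [-1, 1] so that arccos is defined.
        scaled = 2.0 * (x - lo) / (hi - lo) - 1.0
    scaled = np.clip(scaled, -1.0, 1.0)  # guard against rounding error
    phi = np.arccos(scaled)              # polar-angle representation
    # Entry (i, j) is cos(phi_i + phi_j).
    return np.cos(phi[:, None] + phi[None, :])


def markov_transition_field(x, n_bins=4):
    """Markov Transition Field of a 1-D series (illustrative sketch)."""
    x = np.asarray(x, dtype=float)
    # Interior quantile edges split the samples into n_bins bins.
    edges = np.quantile(x, np.linspace(0.0, 1.0, n_bins + 1)[1:-1])
    bins = np.digitize(x, edges)         # bin index in 0 .. n_bins - 1
    # Count transitions between the bins of consecutive samples.
    W = np.zeros((n_bins, n_bins))
    np.add.at(W, (bins[:-1], bins[1:]), 1.0)
    # Row-normalise counts into transition probabilities.
    row_sums = W.sum(axis=1, keepdims=True)
    W = np.divide(W, row_sums, out=np.zeros_like(W), where=row_sums > 0)
    # Entry (i, j) is the transition probability from bin(x_i) to bin(x_j).
    return W[bins[:, None], bins[None, :]]
```

For a gesture segment of n samples, each function returns an n-by-n array. Under the assumptions above, one such image per accelerometer axis could be stacked as channels to form the input of a convolutional network.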