Methods for measuring eating behavior (known as meal microstructure) often rely on manual annotation of bites, chews, and swallows in meal videos or wearable sensor signals. Manual annotation can be time-consuming and error-prone, while wearable sensors may capture only some aspects of eating (e.g., chews only). The aim of this study is to develop a method that automatically detects and counts bites and chews from meal videos. The method was developed on a dataset of 28 volunteers consuming unrestricted meals in the laboratory under video observation. First, the faces in the video (regions of interest, ROIs) were detected using Faster R-CNN. Second, a pre-trained AlexNet was fine-tuned on the detected faces to classify each image as bite or no-bite. Third, affine optical flow was computed between consecutively detected faces to estimate the rotational movement of the pixels in the ROIs. The chews in a meal video were counted by reducing the 2-D images to a 1-D optical flow parameter signal and finding its peaks. The developed bite and chew counting algorithm was applied to 84 meal videos collected from the 28 volunteers. Relative to manual annotation, a mean accuracy (±STD) of 85.4% (±6.3%) was obtained for the number of bites and 88.9% (±7.4%) for the number of chews. The proposed method for automatic bite and chew counting shows promising results and could serve as an alternative to manual annotation.
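To illustrate the final counting step, the sketch below counts peaks in a 1-D signal and scores the count against a manual annotation. It is only a minimal illustration under stated assumptions: the abstract does not specify the smoothing, the peak criteria, or the exact accuracy formula, so the moving-average window, the `min_height` and `min_distance` thresholds, the relative-error accuracy, and all function names are hypothetical. A synthetic sine wave stands in for the optical flow parameter.

```python
import math


def moving_average(signal, window):
    """Smooth a 1-D signal with a centered moving average.

    Samples near the edges are averaged over a shorter window.
    """
    half = window // 2
    smoothed = []
    for i in range(len(signal)):
        lo = max(0, i - half)
        hi = min(len(signal), i + half + 1)
        smoothed.append(sum(signal[lo:hi]) / (hi - lo))
    return smoothed


def count_peaks(signal, min_height, min_distance):
    """Count local maxima that reach min_height and lie at least
    min_distance samples after the previously accepted peak."""
    count = 0
    last_peak = -min_distance
    for i in range(1, len(signal) - 1):
        is_local_max = signal[i - 1] < signal[i] >= signal[i + 1]
        if is_local_max and signal[i] >= min_height and i - last_peak >= min_distance:
            count += 1
            last_peak = i
    return count


def count_accuracy(predicted, annotated):
    """Assumed accuracy definition: 1 minus the relative counting error."""
    return 1.0 - abs(predicted - annotated) / annotated


# Synthetic stand-in for the 1-D optical flow parameter (assumption):
# 10 s of video at 30 frames/s with a 1.5 Hz oscillation, i.e. 15 cycles.
fps = 30
flow = [math.sin(2 * math.pi * 1.5 * t / fps) for t in range(10 * fps)]

smoothed = moving_average(flow, window=5)
chews = count_peaks(smoothed, min_height=0.3, min_distance=fps // 3)
print("estimated chews:", chews)
print("accuracy vs. 15 annotated chews:", count_accuracy(chews, 15))
```

On real video-derived signals the thresholds would need tuning, since peak height and spacing depend on frame rate, face scale, and how the optical flow parameter is normalized.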