We propose an algorithm for detecting mouth opening and closing events. Our method is translation and rotation invariant, runs at high speed, and does not require segmented lips. The approach is based on a recently developed optical flow algorithm that handles the motion of linear structures in a stable and consistent way. Furthermore, we provide a semi-automatic tool for generating groundtruth segmentation of video data, also based on the optical flow algorithm, which is used here to track keypoints at more than 200 frames/second. We make groundtruth for 50 speech sessions of the XM2VTS database [16] available for download, along with the means to segment further sessions with relatively little user interaction. We use the generated groundtruth to test the proposed event detection algorithm and show that it yields promising results. The semi-automatic tool will be a useful resource for researchers who need groundtruth segmentation of video from the XM2VTS database and others.