Facial micro-expressions can reveal a person's actual mental state and emotions. Therefore, micro-expression recognition has crucial applications in many fields, such as lie detection, clinical medicine, and defense security. However, conventional methods extract features from hand-designed facial regions, which fail to effectively cover the critical regions of micro-expressions, since micro-expressions are localized and asymmetric. Consequently, we propose the Haphazard Cuboids (HC) feature extraction method, which generates target regions through a haphazard sampling technique and then extracts spatio-temporal micro-expression features. HC consists of two modules: spatial patches generation (SPG) and temporal segments generation (TSG). SPG generates localized facial regions, and TSG generates temporal intervals. Through extensive experiments, we demonstrate the superiority of the proposed method. We then analyze the two modules with both conventional and deep-learning methods and find that each can significantly improve the performance of micro-expression recognition. In particular, we embed the SPG module into a deep-learning framework and experimentally demonstrate the effectiveness and superiority of the proposed sampling method in comparison with state-of-the-art methods. Furthermore, we analyze the TSG module with the maximum overlapping interval (MOI) method and find that the resulting interval coincides with the maximum interval of the apex-frame distribution in CASME II and SAMM. Therefore, analogous to the region of interest (ROI) on the human face, micro-expressions also exhibit a similar ROI in the temporal dimension, whose position is highly relevant to the moment of peak intensity, i.e., the apex frame.

INDEX TERMS Feature extraction, haphazard sampling, micro-expression recognition, ROI
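
To make the sampling idea concrete, the following is a minimal sketch, not the authors' implementation, of how randomly placed spatial patches (an SPG-style step) and randomly placed temporal segments (a TSG-style step) could be combined into cuboids over a micro-expression clip; the function name `haphazard_cuboids` and all parameter values are illustrative assumptions rather than settings from the paper.

```python
import numpy as np

def haphazard_cuboids(clip, num_cuboids=16, patch_size=16, segment_len=8, rng=None):
    """Illustrative sketch of haphazard cuboid sampling (assumed interface).

    clip: array of shape (T, H, W), e.g. a grayscale micro-expression sequence.
    The patch size, segment length, and number of cuboids are placeholders.
    """
    rng = np.random.default_rng() if rng is None else rng
    T, H, W = clip.shape
    cuboids = []
    for _ in range(num_cuboids):
        # SPG-style step: sample a random spatial patch location.
        y = rng.integers(0, H - patch_size + 1)
        x = rng.integers(0, W - patch_size + 1)
        # TSG-style step: sample a random temporal segment (interval).
        t = rng.integers(0, T - segment_len + 1)
        cuboids.append(clip[t:t + segment_len, y:y + patch_size, x:x + patch_size])
    # In the actual pipeline, spatio-temporal features would then be
    # extracted from each sampled cuboid.
    return cuboids

# Example: a 32-frame, 128x128 clip yields 16 random 8x16x16 cuboids.
demo = np.zeros((32, 128, 128), dtype=np.float32)
patches = haphazard_cuboids(demo)
print(len(patches), patches[0].shape)
```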