As one of the pioneering data representations, the point cloud has demonstrated a straightforward capacity to depict fine geometry in many applications, including computer graphics, molecular structure analysis, modern sensing signal processing, and more. However, unlike computer graphics models obtained with auxiliary regularization techniques or from synthesis, raw sensor/scanner (metric) data often contain natural random noise caused by multiple extrinsic factors, especially in high-speed imaging scenarios. On the other hand, grid-like imaging modalities (e.g., RGB images or video frames) tend to entangle the aspects of interest with environmental variations such as pose and illumination through their Euclidean sampling and processing pipelines. As a typical problem of this kind, 3D Facial Expression Recognition (3D FER) has advanced to a new stage, yet difficulties remain in designing efficient feature abstraction methods for high-dimensional observations and stabilization methods that provide adequate robustness against random exterior variations. In this paper, a localized and smoothed overlapping kernel is proposed to extract discriminative intrinsic geometric features. By associating the deformation stability induced by the manifold scattering transform with certain types of exterior perturbations, we provide a novel framework that directly consumes point cloud coordinates for FER, requiring no predefined meshes or other features/signals. As a result, our compact framework achieves 78.33% accuracy on the expression recognition challenge of the Bosphorus dataset and 77.55% on 3D-BUFE.
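To make the pipeline concrete, the sketch below illustrates one plausible way to compute scattering-style features directly from raw point cloud coordinates. It is not the paper's implementation: it assumes a k-NN graph Laplacian as a discrete surrogate for the facial surface manifold and uses simplified lazy-diffusion wavelets in place of the proposed localized and smoothed overlapping kernel; the function names `knn_graph` and `scattering_features` and all parameter values are illustrative.

```python
# Minimal sketch (not the authors' method): graph-scattering features from raw xyz coordinates.
import numpy as np
from scipy.spatial import cKDTree
from scipy.sparse import csr_matrix, identity
from scipy.sparse.csgraph import laplacian

def knn_graph(points, k=8, sigma=0.05):
    """Symmetric Gaussian-weighted k-NN affinity built from point coordinates."""
    tree = cKDTree(points)
    dists, idx = tree.query(points, k=k + 1)          # first neighbor is the point itself
    n = points.shape[0]
    rows = np.repeat(np.arange(n), k)
    cols = idx[:, 1:].ravel()
    vals = np.exp(-(dists[:, 1:].ravel() ** 2) / (2 * sigma ** 2))
    W = csr_matrix((vals, (rows, cols)), shape=(n, n))
    return (W + W.T) / 2                               # symmetrize the affinity

def scattering_features(points, k=8, scales=(1, 2, 4), order=2):
    """Low-order scattering-style moments from lazy-diffusion wavelets on the k-NN graph.

    The input signal is simply the raw xyz coordinates; each wavelet band is a
    simplified difference |u - P^j u| followed by a modulus and global average.
    """
    W = knn_graph(points, k)
    L = laplacian(W, normed=True)
    P = identity(points.shape[0]) - 0.5 * L            # lazy diffusion operator, eigenvalues in [0, 1]
    x = points
    feats = [x.mean(axis=0)]                           # zeroth-order (lowpass) moment
    layer = [x]
    for _ in range(order):
        next_layer = []
        for u in layer:
            for j in scales:
                Pu = u
                for _ in range(j):                     # diffuse j steps: P^j u
                    Pu = P @ Pu
                band = np.abs(u - Pu)                  # wavelet modulus (simplified band)
                feats.append(band.mean(axis=0))        # invariant moment per band
                next_layer.append(band)
        layer = next_layer
    return np.concatenate(feats)

# Usage: feats = scattering_features(point_cloud)  # point_cloud: (N, 3) float array
```

Under these assumptions, the averaged wavelet-modulus moments are stable to small deformations of the sampled surface, which is the property the abstract associates with robustness to exterior perturbations; a classifier (e.g., an SVM or a small MLP) would then map the feature vector to expression labels.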