Detecting crowd density levels and anomalies is an active topic in video surveillance, especially for human-centric actions and activity-based movements. In some settings, a variation in density level is itself considered an anomaly. Crowd behaviour identification relies on computer vision and mainly deals with the spatial information of the video foreground. In this work, we focus on a deep-learning-based, attention-oriented classification system for identifying several basic movements in public places, in particular human flock movement, sudden motion changes and panic events, in several indoor and outdoor settings. The important spatial features are extracted by a bilinear CNN and a multicolumn multistage CNN from morphologically preprocessed video frames. Finally, abnormal events and crowd density levels are classified by combining an attention feature with the multilayer CNN feature and modifying the fully connected layer for binary and multiclass categories. We validate the proposed method on several video surveillance datasets, including PETS2009, UMN and UCSD. The proposed method achieved accuracies of 98.62%, 98.95%, 96.97%, 99.10% and 98.38% on the UCSD Ped1, UCSD Ped2, PETS2009, UMN Plaza1 and UMN Plaza2 datasets, respectively, with different pretrained models. Compared with recent approaches, the proposed method (MCMS-BCNN-Attention) achieved the highest accuracy. For anomaly detection with binary classification on the UMN and PETS2009 datasets, it was compared with a state-of-the-art method and achieved the best AUC results, 0.9953 and 1.00, respectively.