Currently, 3D objects are usually represented by 3D bounding boxes. Much research has focused on detecting 3D objects directly from point clouds, and significant progress has been made in this field. However, we find that there is still room for improvement in three aspects. The first is point cloud feature extraction. Many successful methods are based on PointNet/PointNet++, which use multi-layer perceptrons (MLPs) to extract the features from which seed points are generated, without considering foreground and background cues. The second aspect is grouping. The “vote-based cluster” grouping method introduced by the pioneering VoteNet ignores shape information, which is very important in object detection. The final aspect is the ability to model grouped clusters. Most successful methods treat grouped clusters separately, regardless of their different contributions to the final detection. To address these challenges, we propose three modules: the foreground-aware module, the voting-aware module, and the cluster-aware module. Extensive experiments on two large datasets of real 3D scans, ScanNet and SUN RGB-D, demonstrate the effectiveness of our method for 3D object detection on point clouds.