Environment perception using camera, radar, and/or lidar sensors has improved significantly in the last few years because of deep learning-based methods. However, many of these methods rely on supervised learning, which requires a considerable amount of annotated data. Due to uncertainties in multi-sensor data, automating the data labeling process is extremely challenging; hence, labeling is still performed manually to a large extent. Even though full automation of this process is difficult, semi-automation can significantly ease it. Existing work in this direction is still very limited. In this paper, a novel semi-automatic annotation methodology is therefore developed for labeling RGB camera images and 3D automotive radar point cloud data using a smart infrastructure-based sensor setup. This paper also describes a new method for 3D radar background subtraction to remove clutter and introduces a new object category, GROUP, for radar-based detection of closely located vulnerable road users. To validate the work, a dataset named INFRA-3DRC is created using this methodology, in which 75% of the labels are automatically generated. In addition, a radar cluster classifier and an image classifier are developed, trained, and tested on this dataset, achieving accuracies of 98.26% and 94.86%, respectively. The dataset and Python scripts are available at https://fraunhoferivi.github.io/INFRA-3DRC-Dataset/.