An increasing number of connected vehicles (CVs) driving alongside regular vehicles (RVs) on the road is an inevitable stage of future traffic development. Because accurate traffic flow state detection is essential for safe and efficient traffic, the level of road intelligence is being enhanced by the mass deployment of roadside perception devices, which are capable of sensing the mixed traffic flow consisting of RVs and CVs. Against this background, we propose a roadside radar and camera data fusion framework to improve the accuracy of traffic flow state detection, which uses the relatively accurate traffic parameters obtained through real-time communication between CVs and the roadside unit (RSU) as calibration values for training a backpropagation (BP) neural network. Then, with the perception data collected by roadside sensors, the BP neural network-based data fusion model is applied to all vehicles, including RVs. Furthermore, to account for changes in the road environment, a dynamic BP fusion method is proposed, which adopts dynamic training by updating samples conditionally and is applied to fuse traffic flow, occupancy, and RV speed data. Simulation results demonstrate that, for both CV data and all-vehicle (including RV) data, the proposed dynamic BP fusion method is more accurate than single-sensor detection, the entropy-based Bayesian fusion method, and traditional BP fusion without CV-based training. It achieves smaller errors, and the detection accuracies of vehicle speed, traffic flow, and occupancy are all above 95%.
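As a minimal illustration of the fusion idea described above (not the authors' implementation), the sketch below trains a small BP network to map paired radar and camera speed readings to the CV-reported speed, and only retrains when the fusion error on fresh CV samples exceeds a threshold, mimicking the conditional sample update. The layer sizes, learning rate, error threshold, and synthetic sensor noise are all illustrative assumptions; the paper's actual network configuration and its flow/occupancy features are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

class BPFusionNet:
    """Minimal 2-4-1 backpropagation network that fuses radar and camera
    speed readings into one estimate (illustrative sizes, not from the paper)."""

    def __init__(self, n_in=2, n_hidden=4, lr=0.05):
        self.W1 = rng.normal(0, 0.5, (n_in, n_hidden))
        self.b1 = np.zeros(n_hidden)
        self.W2 = rng.normal(0, 0.5, (n_hidden, 1))
        self.b2 = np.zeros(1)
        self.lr = lr

    def forward(self, X):
        self.h = np.tanh(X @ self.W1 + self.b1)      # hidden activations
        return self.h @ self.W2 + self.b2            # linear output layer

    def train_step(self, X, y):
        err = self.forward(X) - y.reshape(-1, 1)     # output error
        # Backpropagate gradients of the mean squared error
        dW2 = self.h.T @ err / len(X)
        db2 = err.mean(axis=0)
        dh = (err @ self.W2.T) * (1 - self.h ** 2)   # through tanh
        dW1 = X.T @ dh / len(X)
        db1 = dh.mean(axis=0)
        for p, g in ((self.W1, dW1), (self.b1, db1),
                     (self.W2, dW2), (self.b2, db2)):
            p -= self.lr * g                         # gradient descent update
        return float((err ** 2).mean())

def dynamic_fusion(net, radar, camera, cv_truth, err_thresh=0.5, epochs=500):
    """Conditional retraining: refresh the sample set and retrain only when
    the current fusion error on new CV samples exceeds a threshold
    (the threshold value is an assumption for illustration)."""
    X = np.column_stack([radar, camera])
    # Standardize inputs/targets so the tanh units do not saturate
    Xn = (X - X.mean(axis=0)) / X.std(axis=0)
    y_mu, y_sd = cv_truth.mean(), cv_truth.std()
    pred = net.forward(Xn).ravel() * y_sd + y_mu
    if ((pred - cv_truth) ** 2).mean() > err_thresh:
        for _ in range(epochs):
            net.train_step(Xn, (cv_truth - y_mu) / y_sd)
    return net.forward(Xn).ravel() * y_sd + y_mu

# Synthetic CV calibration data: true speed plus sensor-specific bias/noise
true_speed = rng.uniform(10, 30, 200)                 # m/s
radar = true_speed + rng.normal(0, 0.8, 200)          # radar: low noise
camera = true_speed * 0.95 + rng.normal(0, 1.5, 200)  # camera: bias + noise

net = BPFusionNet()
fused = dynamic_fusion(net, radar, camera, true_speed)
print("fused speed RMSE:", np.sqrt(((fused - true_speed) ** 2).mean()))
```

In this sketch, CV-reported speeds play the role of the calibration values; once trained, the same network could be applied to radar/camera readings of RVs, for which no ground truth is available, which is the core of the framework described above.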