Abstract-Dynamic Movement Primitives (DMPs) are widely used for encoding motion data. Task-parameterized DMPs (TP-DMPs) can adapt a learned skill to different situations, but most existing systems rely on a customized vision pipeline to extract the task-specific variables, which limits their applicability to real-world scenarios. This paper proposes a method for combining a DMP with a Convolutional Neural Network (CNN). Our approach preserves the generalization properties of the DMP, while the CNN learns the task-specific features directly from camera images. This eliminates the need to extract task parameters explicitly, since the camera image is used directly during motion reproduction. The performance of the developed approach is demonstrated through a trash-cleaning task executed with a real robot. We also show that, by using data augmentation, the learned sweeping skill can be generalized to arbitrary objects. The experiments demonstrate the robustness of our approach across several different settings.