e most common use of robots is to effectively decrease the human's effort with desirable output. In the human-robot interaction, it is essential for both parties to predict subsequent actions based on their present actions so as to well complete the cooperative work. A lot of effort has been devoted in order to attain cooperative work between human and robot precisely. In case of decision making , it is observed from the previous studies that short-term or midterm forecasting have long time horizon to adjust and react. To address this problem, we suggested a new vision-based interaction model. e suggested model reduces the error amplification problem by applying the prior inputs through their features, which are repossessed by a deep belief network (DBN) though Boltzmann machine (BM) mechanism. Additionally, we present a mechanism to decide the possible outcome (accept or reject). e said mechanism evaluates the model on several datasets. Hence, the systems would be able to capture the related information using the motion of the objects. And it updates this information for verification, tracking, acquisition, and extractions of images in order to adapt the situation. Furthermore, we have suggested an intelligent purifier filter (IPF) and learning algorithm based on vision theories in order to make the proposed approach stronger. Experiments show the higher performance of the proposed model compared to the state-of-the-art methods.