Every parent wishes to protect their children from online pornography, cyberbullying, and cyber predators. Several existing approaches analyze only a limited amount of the information stemming from the child's interactions with the corresponding online party. Some restrict access to websites based on a blacklist of known forbidden URLs; others attempt to parse and analyze the multimedia content exchanged between the two parties. However, new URLs can be used to circumvent a blacklist, and images, video, and text can each appear safe in isolation yet need to be judged jointly. We propose a highly modular framework for analyzing content in its final form at the user interface, or Human-Computer Interaction (HCI) layer, as it appears before the child: on the screen and through the speakers. Our approach is to produce Children's Agents for Secure and Privacy Enhanced Reaction (CASPER), which analyzes screen captures and audio signals in real time and makes a decision based on all of the information at its disposal, under limited hardware capabilities. We employ a collection of deep learning techniques for image, audio, and text processing to categorize visual content as pornographic or neutral, and textual content as cyberbullying or neutral. We additionally contribute a custom dataset that offers a wide spectrum of objectionable content for evaluation and training purposes. CASPER demonstrates an average accuracy of 88% and an F1 score of 0.85 when classifying text, and an accuracy of 95% when classifying pornography.

INDEX TERMS Cyber-bullying, Cyber-grooming, Online Safety, Pornography filter, Real time agent.