This paper explores a cyber-physical system (CPS) that enables a billboard viewer to instantly obtain a snapshot of the displayed media content with a smartphone gesture. We first define an add-on device mounted on each billboard or signage, called the Media Processor Content Access Box (MP-CAB), which collaborates with the content management server (CMS) and the viewers' smartphones to support the desired application. We present the detailed design of the MP-CAB, and then introduce a simple yet efficient multicast scheduling approach for operation in the presence of a lossy WiFi link and multiple viewers with heterogeneous receiving modes. We model the response delay of the proposed CPS, jointly considering a set of key parameters such as file size, the percentage of each receiver mode, and the length of the snapshot cycle (SC). Extensive case studies are conducted to analyze the proposed CPS and the employed scheduling approach in depth, with a focus on the relationships among several operating parameters. Specifically, we examine the network operation and multicast scheduling settings that achieve the minimal expected response delay and the maximal image size, in order to understand the behavior of the proposed CPS during real-time content snapshot acquisition.