The sheer volume of IoT networks being deployed today presents a major "attack surface" and poses security risks at a scale never encountered before. A single IoT device or node infected with malware can spread malicious activity across the network, eventually disrupting network functionality or compromising the network entirely. Simply detecting and quarantining malware in IoT networks does not guarantee that its propagation is prevented. Traditional control-theoretic approaches to malware confinement are also ineffective: most existing works either do not consider real-time malware control strategies that can be implemented using uncertain infection information from the nodes in the network, or decouple the containment problem from network performance. In response, we propose a two-pronged approach with malware detection at the node level and malware confinement at the network level. At the node level, we deploy a recently proposed lightweight runtime malware detector that uses Hardware Performance Counter (HPC) values. This node-level malware information is combined with malware propagation information and fed at runtime to a stochastic predictive controller, which confines malware propagation without hampering network performance. Combining the node-level malware information with the model predictive containment strategy achieves an average network throughput of nearly 200% of that of an IoT network without any defense, and up to 160% of that of a network with commonly employed state-of-the-art heuristic approaches for malware confinement. Furthermore, to scale with ever-increasing network topology sizes, we introduce a novel multi-attribute graph translation that can predict the network topology and node state information when provided with a snapshot of the topology and node-level malware infection.
The proposed multi-attribute graph translation achieves a Root Mean Square Error (RMSE) below 5.88 relative to the model predictive containment strategy, and exhibits nearly constant graph translation time and limited resource utilization, independent of the network size.