This work examines how a forced-attention technique can be applied to the task of video activity recognition. The Look&Learn system performs early fusion of detected critical areas of attention with the original raw image data to train a video activity recognition system, specifically for the task of Squat "Quality" Detection. Look&Learn is compared with previous work, USquat: Look&Learn achieves 98.96% accuracy on average, whereas USquat achieves 93.75%, demonstrating the improvement that can be gained from Look&Learn's forced-attention technique. Look&Learn is deployed in an Android application as a proof of concept, and the results are presented.