Visual working memory (vWM) plays a crucial role in visual information processing and higher cognitive functions; however, it has a very limited capacity. Recently, several studies have successfully modulated vWM capacity in humans using entrainment with transcranial alternating current stimulation (tACS) by targeting parietal theta in a frequency-specific manner. In the current study, we aim to expand upon these findings by using sensory instead of electrical stimulation. Across six behavioral experiments (combined N = 209), we applied rhythmic visual and auditory sensory stimulation at 4 Hz and 7 Hz, aiming to modulate vWM capacity. Collectively, the results showed an overall robust improvement with sensory stimulation at either frequency, compared to baseline. However, contrary to our prediction, 7 Hz stimulation tended to slightly outperform 4 Hz stimulation. Importantly, the observed facilitatory effect was mainly driven by the low-capacity sub-group of participants. Follow-up experiments using the Attention Network Test (ANT) and pupillometry measures did not find evidence that this effect could be directly attributed to modulation of phasic or tonic arousal. We speculate that our results differed from those obtained with tACS owing to the targeting of functionally different theta oscillations or to the modulation of participants’ temporal expectations.