Dual-task costs are often significantly reduced or eliminated when both tasks use compatible stimulus-response (S-R) pairs. Either by design or unintentionally, S-R pairs used in dual-task experiments that produce small dual-task costs typically have two properties that may reduce dual-task interference. One property is that they are easy to keep separate; specifically, one task is often visual-spatial and contains little verbal information, and the other task is primarily auditory-verbal and has no significant spatial component. The other property is that the two sets of S-R pairs are often compatible at the set level; specifically, the collection of stimuli for each task is strongly related to the collection of responses for that task, even if there is no direct correspondence between the individual items in the sets. In this paper, we directly test which of these two properties is driving the absence of large dual-task costs. We used stimuli (images of hands and auditory words) that, when previously paired with responses (button presses and vocal utterances), produced minimal dual-task costs, but we manipulated the shape of the hands in the images and the auditory words. If set-level compatibility is driving efficient performance, then these changes should not affect dual-task costs. However, we found large changes in the dual-task costs depending on the specific stimuli and responses. We conclude that set-level compatibility is not sufficient to minimize dual-task costs. We connect these findings to divisions within the working memory system and discuss implications for understanding dual-task performance more broadly.