This study aimed to determine the unit of interaction between visual working memory (VWM) and attention. To this end, we tested two opposing hypotheses: (a) the unit of interaction is a Boolean map, a data format that can contain only one within-dimension feature (e.g., “red” or “circle”; Boolean-map-unit hypothesis); and (b) the unit of interaction is an object (object-unit hypothesis). In two experiments, participants held in VWM either two colors, encoded from one or two objects, or a single color, and then performed a search task that sometimes contained a distractor in a memory-matching color. Attentional capture by two different colors encoded from one integrated object was equivalent to that by a single color and much stronger than that by two colors from separate objects, which supports the object-unit hypothesis. These findings have crucial implications for understanding the architecture of interaction between VWM and attention.