The present electroencephalogram study used an attention probe paradigm to investigate how semantic and acoustic structures constrain temporal attention during speech comprehension. Spoken sentences served as stimuli, each containing a four-character critical phrase whose third character was the target character. We manipulated both the semantic relationship between the target character and the two immediately preceding characters and the presence or absence of a pitch accent on the first character. In addition, an attention probe was either presented concurrently with the target character or not presented. The results showed that the N1 effect evoked by the attention probe was larger in amplitude and began earlier (indicating enhanced attention) when the target character and the preceding two characters belonged to the same semantic event than when they spanned a semantic-event boundary, and this effect occurred only in the unaccented conditions. The results suggest that, during speech comprehension, the semantic level of event structure can constrain attention allocation along the temporal dimension and reverse the attention attenuation effect of prediction; meanwhile, the semantic and acoustic levels of event structure interact with each other immediately to modulate auditory-temporal attention. The results are discussed with regard to the predictive coding account of attention.