Previous studies demonstrate that semantics, the higher-level meaning of multimodal stimuli, can influence multisensory integration. Valence, the affective response to images, has not yet been tested in non-priming response time (RT) or temporal order judgement (TOJ) tasks. This study investigates the effects of both semantic congruency and valence of non-speech audiovisual stimuli on multisensory integration using RT and TOJ tasks. These tasks assess processing speed (RT), the point of subjective simultaneity (PSS), and the time window within which multisensory stimuli are likely to be perceived as simultaneous (the Temporal Binding Window; TBW). Forty participants (mean age: 26.25; females = 17) were recruited from Prolific Academic, resulting in 37 complete datasets. Congruence and valence each had a significant main effect on RT (congruent and high-valence stimuli decreased RT), and their interaction was also significant (the congruent/high-valence condition was significantly faster than all others). For TOJ, images high in valence required the visual stimulus to be presented significantly earlier than the auditory stimulus for the two to be perceived as simultaneous. Further, a significant interaction of congruence and valence on the PSS revealed that the PSS in the congruent/high-valence condition was significantly earlier than in all other conditions. A subsequent analysis showed a positive correlation between TBW width (b-values) and RT (as the TBW widens, RT increases) for the conditions whose PSS differed most from 0 (Congruent/High and Incongruent/Low). This study provides new evidence supporting previous research on semantic congruency and presents a novel incorporation of valence into behavioural responses.
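
For readers unfamiliar with how a PSS and a b-value can be obtained from TOJ responses, the following is a minimal sketch and not necessarily the procedure used in this study. It assumes a logistic psychometric function, P("visual first") = 1 / (1 + exp(-(SOA - PSS) / b)), with the stimulus onset asynchrony (SOA) in milliseconds and positive when the visual stimulus leads. The sign convention, the function names `logistic` and `fit_pss_and_b`, and the simple logit-linear least-squares fit are all illustrative assumptions.

```python
import math


def logistic(soa, pss, b):
    """Assumed psychometric function: probability of a 'visual first' response."""
    return 1.0 / (1.0 + math.exp(-(soa - pss) / b))


def fit_pss_and_b(soas, p_visual_first, eps=1e-3):
    """Estimate PSS and b by linear regression on logit-transformed proportions.

    Under the assumed model, logit(p) = soa / b - pss / b, so the regression
    slope equals 1 / b and the intercept equals -pss / b.
    """
    # Clip proportions away from 0 and 1 so the logit stays finite.
    logits = []
    for p in p_visual_first:
        p = min(max(p, eps), 1.0 - eps)
        logits.append(math.log(p / (1.0 - p)))

    # Ordinary least squares for a single predictor.
    n = len(soas)
    mean_x = sum(soas) / n
    mean_y = sum(logits) / n
    sxx = sum((x - mean_x) ** 2 for x in soas)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(soas, logits))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    b = 1.0 / slope            # larger b = shallower curve = wider window
    pss = -intercept / slope   # SOA at which both orders are equally likely
    return pss, b


if __name__ == "__main__":
    # Hypothetical SOAs (ms) and noise-free proportions, for illustration only.
    soas = [-200, -100, -50, 0, 50, 100, 200]
    proportions = [logistic(s, 30.0, 50.0) for s in soas]
    print(fit_pss_and_b(soas, proportions))
```

In this sketch a larger b corresponds to a shallower curve and therefore a wider TBW, which matches the sense in which TBW width is indexed by b-values above. With real data, a maximum-likelihood fit to trial-level responses would usually be preferred over regression on aggregated proportions.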