Speakers necessarily monitor their conversations and apply adaptive control to their own speech, sometimes halting it altogether, to ensure that the words they produce are not only linguistically accurate but also contextually appropriate. However, we know relatively little about the neurocognitive mechanisms of monitoring and control engaged in speech production. The experimental series contained within this thesis investigated two key topics relevant to monitoring and control in spoken word production, integrating both behavioural and neuroscience explanations and evidence. The first topic concerns the recent finding of taboo distractor word interference in the picture-word interference (PWI) paradigm. Across a series of experiments, we contrast proposed mechanisms of the effect based on post-lexical monitoring of socially inappropriate (i.e., embarrassing/offensive) stimuli versus early attention capture due to the arousing nature of taboo stimuli. Our behavioural results show that interference from taboo language during spoken word production arises relatively early (minimally, prior to phonological encoding), and functional magnetic resonance imaging (fMRI) shows that it is associated with a distributed thalamo-cortical network in the brain. These findings lend support to arousal-based attention-capture theories rather than post-lexical monitoring theories. The second topic concerns the perceptual loop theory's proposal that inner speech monitoring is based on comparisons at a phonological level of representation (Levelt, 1989; Wheeldon & Levelt, 1995). According to this account, the same comprehension system used to perceive our own and others' produced speech (outer loop) also inspects our pre-articulatory speech (inner loop). Being interrupted by a conversational partner mid-production should therefore place increased demands on a speaker's inner and outer monitoring loops before production is halted.
To test this proposal, we used a modified stop-signal paradigm combining picture naming with auditory words that were phonologically related or unrelated to the target pictures. However, we failed to observe any influence of phonological similarity on halting performance. An fMRI experiment showed that successful, compared with unsuccessful, halting of speech was associated with increased BOLD signal bilaterally in the posterior middle temporal, frontal, and parietal lobes, and with decreased BOLD signal bilaterally in the posterior superior temporal gyrus, in the left anterior superior temporal gyrus, and in the right inferior frontal gyrus. We also investigated whether halting is influenced by presenting the correct picture name as a go-signal to continue production, thereby changing the decision process to one of stop versus go. Our results show that the ability to halt production is influenced by phonological similarity only when both go- and stop-signals are presented. This suggests speakers are able to strategically slow their responses in order to discriminate task-relevant information (i.e., competing word forms) in inner and external speech. Based on our findings, we argue future speech production accounts need ...