Recent dialogue coherence models use coherence features designed for monologue texts, e.g., nominal entities, to represent utterances and then explicitly augment them with dialogue-relevant features, e.g., dialogue act labels. This design has two drawbacks: (a) the semantics of utterances is limited to entity mentions, and (b) the performance of coherence models strongly relies on the quality of the input dialogue act labels. We address these issues by introducing a novel approach to dialogue coherence assessment. We use dialogue act prediction as an auxiliary task in a multi-task learning scenario to obtain informative utterance representations for coherence assessment. Our approach alleviates the need for explicit dialogue act labels during evaluation. The results of our experiments show that our model substantially outperforms its strong competitors (by more than 20 accuracy points) on the DailyDialog corpus, and performs on par with them on the SwitchBoard corpus, for ranking dialogues with respect to their coherence. We release our source code.

Recent approaches to dialogue coherence modeling use coherence features designed for monologue texts, e.g., entity transitions (Barzilay and Lapata, 2005), and augment them with dialogue-relevant features, e.g., dialogue act (DA) labels (Cervone et al., 2018). These DA labels are provided by human annotators or DA prediction models. Such coherence models suffer from the following drawbacks: (a) they curb the semantic representations of utterances to entities, which are sparse in dialogue because of short utterance lengths, and (b) their performance relies on the quality of their input DA labels.
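To make the multi-task setup concrete, the following is a minimal sketch, assuming a shared BiLSTM utterance encoder feeding two heads: an auxiliary DA classifier used only during training, and a coherence scorer used at evaluation. The class name, layer sizes, pooling strategy, and loss weighting are illustrative assumptions, not the paper's actual architecture.

```python
# Hypothetical multi-task sketch: DA prediction as an auxiliary task that
# shapes shared utterance representations for coherence assessment.
import torch
import torch.nn as nn

class MultiTaskCoherenceModel(nn.Module):
    def __init__(self, vocab_size, embed_dim=128, hidden_dim=256, num_da_labels=4):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        # Shared utterance encoder: a BiLSTM over the tokens of each utterance.
        self.utterance_encoder = nn.LSTM(embed_dim, hidden_dim,
                                         batch_first=True, bidirectional=True)
        # Auxiliary head: predicts a DA label per utterance (training only).
        self.da_head = nn.Linear(2 * hidden_dim, num_da_labels)
        # Main head: scores dialogue coherence from pooled utterance vectors.
        self.coherence_head = nn.Linear(2 * hidden_dim, 1)

    def encode_utterance(self, token_ids):
        # token_ids: (batch, seq_len) -> (batch, 2 * hidden_dim)
        embedded = self.embedding(token_ids)
        outputs, _ = self.utterance_encoder(embedded)
        return outputs.mean(dim=1)  # mean-pool over token positions

    def forward(self, utterances):
        # utterances: (batch, num_utterances, seq_len) of token ids
        batch, num_utt, seq_len = utterances.shape
        flat = utterances.view(batch * num_utt, seq_len)
        utt_vecs = self.encode_utterance(flat)            # (batch*num_utt, 2H)
        da_logits = self.da_head(utt_vecs)                # auxiliary task output
        dialogue_vec = utt_vecs.view(batch, num_utt, -1).mean(dim=1)
        coherence_score = self.coherence_head(dialogue_vec).squeeze(-1)
        return coherence_score, da_logits

# Training would combine a coherence ranking loss (e.g., a margin loss between
# a coherent dialogue and a perturbed one) with a weighted DA cross-entropy.
# At evaluation, only coherence_score is consulted, so no DA labels are needed.
```

Because DA supervision is consumed only through the auxiliary loss during training, inference requires no DA labels, which is the point of the multi-task design described above.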