In this paper we address the problem of turn-taking prediction in open-ended communication between humans and dialogue agents. In non-task-oriented interaction with dialogue agents, user inputs tend to be grammatically and lexically diverse, at times quite lengthy, and interspersed with many pauses, all of which makes it harder for the system to decide when to take the turn. As a result, recent turn-taking predictors designed for specific tasks or for human-human interaction are scarcely applicable. We focus primarily on the predictive potential of linguistic features (lexical, syntactic, and semantic) as well as timing features, whereas past work has typically emphasized prosodic features, sometimes supplemented with non-verbal behaviors such as gaze and head movements. The basis for our study is a corpus of 15 "friendly" dialogues between humans and a (Wizard-of-Oz-enabled) virtual dialogue agent, annotated for pause times and types. The turn-taking model obtained by supervised learning predicts turn-taking points with increasing accuracy as it moves from prosodic features alone, to timing and speech-rate features alone, to lexical and syntactic features alone, and achieves state-of-the-art performance with a mixture-of-experts model that combines these feature sets together with a semantic criterion.
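To make the mixture-of-experts idea concrete, the following is a minimal sketch: one classifier ("expert") is trained per feature group and their probability outputs are combined by a gate. Everything here is an illustrative assumption, not the paper's actual model: the synthetic feature values, the logistic-regression experts, and the accuracy-weighted gate (a learned gating network is another common choice).

```python
import numpy as np

rng = np.random.default_rng(0)

def train_logistic(X, y, lr=0.5, steps=300):
    """Train a simple logistic-regression 'expert' by gradient descent."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        g = p - y
        w -= lr * (X.T @ g) / len(y)
        b -= lr * g.mean()
    return w, b

def predict_proba(X, w, b):
    return 1.0 / (1.0 + np.exp(-(X @ w + b)))

# Synthetic stand-ins for three feature groups (prosodic; timing and
# speech rate; lexical/syntactic): each carries a different amount of
# signal about the binary turn-taking label y.
n = 400
y = rng.integers(0, 2, n).astype(float)
groups = {
    "prosodic": y[:, None] * 0.5 + rng.normal(0, 1.0, (n, 2)),
    "timing":   y[:, None] * 1.0 + rng.normal(0, 1.0, (n, 2)),
    "lexical":  y[:, None] * 1.5 + rng.normal(0, 1.0, (n, 2)),
}

# Train one expert per group; the gate weights each expert by its
# held-out accuracy (a fixed gate, chosen here for simplicity).
split = n // 2
experts, weights = {}, {}
for name, X in groups.items():
    w, b = train_logistic(X[:split], y[:split])
    experts[name] = (w, b)
    acc = ((predict_proba(X[split:], w, b) > 0.5) == y[split:]).mean()
    weights[name] = acc

total = sum(weights.values())
mixed = sum(
    (weights[name] / total) * predict_proba(groups[name][split:], *experts[name])
    for name in groups
)
mixture_acc = ((mixed > 0.5) == y[split:]).mean()
```

In this toy setting the combined prediction typically matches or exceeds the weaker individual experts, mirroring the abstract's observation that combining feature groups improves over any single one.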