Grounding in dialogue concerns the question of how the gap between the individual symbol systems of interlocutors can be bridged so that mutual understanding is possible. This problem is highly relevant to human-agent interaction, where mis- or non-understanding is common. We argue that humans minimise this gap by collaboratively and iteratively creating a shared conceptualisation that serves as a basis for negotiating symbol meaning. We then present a computational model that enables an artificial conversational agent to estimate the user's mental state (in terms of contact, perception, understanding, acceptance, and agreement) based on his or her feedback signals, and to use this information to incrementally adapt its ongoing communicative actions to the user's needs. These basic abilities are important for reducing friction in the iterative coordination process of co-constructing grounded symbols in dialogue.
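
To illustrate the kind of estimation this involves, the following minimal Python sketch maintains beliefs over the five listener states named above and updates them via Bayes' rule as individual feedback signals arrive. It is not the model presented in the paper: the signal names, likelihood values, and adaptation threshold are all illustrative assumptions.

```python
# Hypothetical sketch: each listener-state dimension is a binary latent variable
# whose belief is updated with Bayes' rule from observed feedback signals.
# Signal inventory and likelihood values are assumed for illustration only.

DIMENSIONS = ["contact", "perception", "understanding", "acceptance", "agreement"]

# P(signal | state holds) and P(signal | state does not hold) -- assumed values.
LIKELIHOODS = {
    "nod":       {"understanding": (0.8, 0.2), "acceptance": (0.7, 0.3)},
    "frown":     {"understanding": (0.2, 0.7)},
    "gaze_away": {"contact": (0.1, 0.6), "perception": (0.3, 0.6)},
    "uh-huh":    {"contact": (0.9, 0.2), "perception": (0.8, 0.3)},
}

def update_beliefs(beliefs, signal):
    """Bayesian update of the per-dimension beliefs given one feedback signal."""
    for dim, (p_true, p_false) in LIKELIHOODS.get(signal, {}).items():
        prior = beliefs[dim]
        evidence = p_true * prior + p_false * (1.0 - prior)
        beliefs[dim] = p_true * prior / evidence
    return beliefs

beliefs = {dim: 0.5 for dim in DIMENSIONS}  # uninformed priors
for signal in ["uh-huh", "frown"]:          # incremental stream of feedback
    beliefs = update_beliefs(beliefs, signal)

# The agent can then adapt its ongoing communicative actions, e.g. elaborate
# or rephrase when the estimated level of understanding drops too low.
if beliefs["understanding"] < 0.4:
    print("adapt: elaborate or rephrase the current utterance")
```

In this toy version, a single "frown" lowers the belief in understanding from 0.5 to roughly 0.22, triggering an adaptation; the model described in the paper performs this kind of attribution and adaptation incrementally, during the agent's ongoing utterance.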