The purpose of this workshop was to unite the automatic speech recognition (ASR) and natural language processing (NLP) communities to discuss new frameworks for exploiting the rich information present in the speech signal to improve the capabilities of natural language processing applications. Our community objective is to revisit conventional NLP problems with a focus on incorporating the richness of spoken language, as well as to encourage research contributions that promote cross-fertilization between statistical methods for ASR and NLP.

Our inaugural workshop was held at EMNLP to encourage the NLP community to consider and discuss the challenges of combining speech recognition with conventional NLP research, as well as to appreciate the recent successes in this exciting field. The authors in these proceedings have combined ASR and NLP in works that address part-of-speech tagging, constituency parsing and dependency parsing on speech, information extraction and spoken term detection, dialog state tracking, and speech translation, as well as two research assessments that evaluate the fluency and adequacy of English speakers and the role of silence in conversational dialogs.

The invited talk, entitled "Modelling turn-taking in spoken interaction," was given by Gabriel Skantze.

Our workshop also included an open round-table discussion about the current state of speech-centric NLP and some of the research and pragmatic issues that raise a barrier to entry for the larger research community.

We would like to thank the members of the Program Committee for their reviews, as well as our panelists who led our round-
Abstract

Silence is an integral part of the most frequent turn-taking phenomena in spoken conversations. Silence is sized and placed within the conversation flow, and it is coordinated by the speakers along with the other speech acts. The objective of this analytical study is twofold: to explore the functions of silences lasting one second or longer in the information flow of a dyadic conversation, using the sequences of dialog acts present in the turns surrounding each silence; and to design a feature space useful for clustering the silences using a hierarchical concept formation algorithm. The resulting clusters are manually grouped into functional categories based on their similarities. It is observed that silence plays an important role in response preparation and can also indicate speakers' hesitation or indecisiveness. It is also observed that long silences are sometimes used deliberately to force a response from the other speaker, making silence a multi-functional and important catalyst for information flow.