In this paper, we introduced an unsupervised method for semiautomatically removing fillers from spoken dialogues based on their probability distribution, and we examined the effect of filler removal on semantic class induction. We computed the unigram and bigram distributions of fillers on our Chinese voice search data and found that, using these distributions alone, fillers rank within the top 1% of all words. We also measured semantic class induction precision before and after filler removal on both a human-to-computer corpus and a human-to-human corpus. After removing fillers, precision rose from 81.8% to 86.9% in human-to-computer dialogues and from 58.0% to 61.9% in human-to-human dialogues.
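
As an illustration of the distribution-based ranking summarized above, the following is a minimal sketch, not the implementation used in this work. It assumes utterances are already segmented into tokens, estimates relative frequencies for unigrams and bigrams, and returns the top fraction of the ranking as filler candidates for manual review. The function names, the `<s>` and `</s>` boundary markers, the ranking by raw relative frequency, and the toy English data are all assumptions made for the example.

```python
from collections import Counter
import math


def unigram_probs(utterances):
    """Relative frequency of each token over all tokenized utterances."""
    counts = Counter(tok for utt in utterances for tok in utt)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}


def bigram_probs(utterances):
    """Relative frequency of each adjacent token pair.

    Utterances are padded with hypothetical boundary markers so that
    utterance-initial and utterance-final positions are counted too.
    """
    counts = Counter()
    for utt in utterances:
        padded = ["<s>"] + list(utt) + ["</s>"]
        counts.update(zip(padded, padded[1:]))
    total = sum(counts.values())
    return {bg: c / total for bg, c in counts.items()}


def top_candidates(probs, fraction=0.01):
    """Return the highest-probability items in the top `fraction` of the ranking.

    Ties are broken alphabetically; at least one item is always returned.
    These are only candidates: a human still decides which are fillers.
    """
    ranked = sorted(probs, key=lambda item: (-probs[item], item))
    k = max(1, math.ceil(len(ranked) * fraction))
    return ranked[:k]


# Toy data (assumed, for illustration only).
utterances = [
    ["uh", "play", "music"],
    ["uh", "call", "mom"],
    ["uh", "weather", "today"],
]
unigram_candidates = top_candidates(unigram_probs(utterances))
bigram_candidates = top_candidates(bigram_probs(utterances))
```

On this toy input, the frequent token `uh` heads both rankings. On real data the candidate list would be inspected by hand before any token is removed, which is what makes the procedure semiautomatic.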