Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics 2007
DOI: 10.3115/1614108.1614162

Speech summarization without lexical features for Mandarin broadcast news

Abstract: We present the first known empirical study on speech summarization without lexical features for Mandarin broadcast news. We evaluate acoustic, lexical and structural features as predictors of summary sentences. We find that the summarizer yields good performance at the average F-measure of 0.5646 even by using the combination of acoustic and structural features alone, which are independent of lexical features. In addition, we show that structural features are superior to lexical features and our summarizer perf…

Cited by 34 publications (26 citation statements)
References 11 publications
“…Probably the most striking finding that will need further verification and analysis in future work are results indicating that very good summarization performance can be achieved on the basis of acoustic features alone, with no recourse to transcripts or textual features [124,158,231].…”
Section: Summarization Of Speech
Citation type: mentioning
confidence: 99%
“…Among them, the vector space model (VSM) [9], the latent semantic analysis (LSA) method [9], the Markov random walk (MRW) method [10], the maximum marginal relevance (MMR) method [11], the sentence significant score method [12], the LexRank [13], the submodularity-based method [14], and the integer linear programming (ILP) method [15] are the most popular approaches for spoken document summarization. Apart from that, a number of classification-based methods using various kinds of representative features also have been investigated, such as the Gaussian mixture models (GMM) [9], the Bayesian classifier (BC) [16], the support vector machine (SVM) [17] and the conditional random fields (CRFs) [18], to name just a few. In these methods, important sentence selection is usually formulated as a binary classification problem.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
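
The statement above notes that important sentence selection is usually formulated as a binary classification problem. A minimal sketch of that formulation follows, assuming two hypothetical per-sentence features (normalized position and normalized duration) and a small logistic regression trained by gradient descent. The feature names, toy data, and choice of classifier are illustrative assumptions and are not taken from the cited papers.

```python
import math


def sigmoid(z):
    """Logistic function mapping a real score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(features, labels, lr=0.5, epochs=500):
    """Fit weights and bias with plain stochastic gradient descent."""
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            err = p - y  # gradient of the log loss w.r.t. the linear score
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return weights, bias


def predict(weights, bias, x):
    """Return 1 (summary sentence) or 0 (non-summary sentence)."""
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if sigmoid(score) >= 0.5 else 0


# Hypothetical toy data: [normalized position, normalized duration].
# Label 1 marks a sentence chosen for the reference summary.
features = [[0.0, 0.9], [0.1, 0.8], [0.2, 0.7],
            [0.7, 0.3], [0.8, 0.2], [0.9, 0.1]]
labels = [1, 1, 1, 0, 0, 0]

weights, bias = train_logistic(features, labels)
predictions = [predict(weights, bias, x) for x in features]
```

In this toy setup each sentence is labeled independently, which is the sense in which the task is treated as binary classification. The features here are non-lexical only to echo the paper's theme.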
“…Besides traditional unsupervised summarization methods [3][4][5][6][7][8][9], such as those based on document structural, linguistic or prosodic information, proximity or significance measures and relevance scores to identify salient sentences, machine-learning approaches with supervised training have drawn much attention and been applied with good success in a wide arrange of summarization tasks [3][4][5][6][7][8][9]. Specifically, the problem of speech summarization can be formulated as follows: Construct a ranking model (summarizer) that assigns a decision score (or a posterior probability) of being included in the summary to each sentence of a spoken document to be summarized.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
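
The ranking formulation quoted above (assign a decision score to each sentence, then select sentences for the summary) can be sketched as follows. This is a minimal illustration under stated assumptions: the `summarize` helper, the `ratio` parameter, the sentence fields, and the toy scoring function are all hypothetical and do not come from the cited work.

```python
def summarize(sentences, score_fn, ratio=0.5):
    """Keep the top-scoring fraction of sentences, in document order."""
    k = max(1, int(len(sentences) * ratio))
    # Pair each score with its index so ties resolve to the earlier sentence.
    scored = [(score_fn(s), i) for i, s in enumerate(sentences)]
    top = sorted(scored, key=lambda t: (-t[0], t[1]))[:k]
    keep = sorted(i for _, i in top)
    return [sentences[i] for i in keep]


def toy_score(sentence):
    """Hypothetical decision score: long, early sentences rank higher."""
    return sentence["duration"] - sentence["position"]


# Hypothetical spoken document with per-sentence non-lexical features.
document = [
    {"id": "s1", "position": 0.0, "duration": 0.9},
    {"id": "s2", "position": 0.33, "duration": 0.2},
    {"id": "s3", "position": 0.66, "duration": 0.8},
    {"id": "s4", "position": 1.0, "duration": 0.3},
]

summary_ids = [s["id"] for s in summarize(document, toy_score, ratio=0.5)]
```

In practice `toy_score` would be replaced by the output of a trained model, such as a posterior probability. The selection step stays the same regardless of how the scores are produced.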