Proceedings of the 25th International Conference on Machine Learning (ICML '08), 2008
DOI: 10.1145/1390156.1390173

Learning to sportscast

Abstract: We present a novel commentator system that learns language from sportscasts of simulated soccer games. The system learns to parse and generate commentaries without any engineered knowledge about the English language. Training is done using only ambiguous supervision in the form of textual human commentaries and simulation states of the soccer games. The system simultaneously tries to establish correspondences between the commentaries and the simulation states as well as build a translation model. We also prese…
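The core idea in the abstract — simultaneously figuring out which simulation event each commentary sentence refers to while training a language model from those tentative pairings — can be illustrated with a toy EM loop. This is a minimal sketch, assuming a simple bag-of-words model P(word | event); the function name, data layout, and scoring are illustrative and are not the paper's actual algorithms.

```python
from collections import defaultdict

def em_align(comments, candidates, iters=20, alpha=0.1):
    """Ambiguous-supervision alignment sketch.

    comments:   list of comments, each a list of words
    candidates: for each comment, the list of candidate event ids
                (the ambiguous supervision: several events co-occur
                with each comment in time)
    Returns the most probable event for each comment.
    """
    vocab = {w for c in comments for w in c}
    events = {e for cs in candidates for e in cs}
    # Start from a uniform word-given-event model.
    prob = {(w, e): 1.0 / len(vocab) for w in vocab for e in events}

    for _ in range(iters):
        counts = defaultdict(float)
        totals = defaultdict(float)
        for words, cand in zip(comments, candidates):
            # E-step: posterior over this comment's candidate events,
            # scored by the current bag-of-words model.
            scores = []
            for e in cand:
                s = 1.0
                for w in words:
                    s *= prob[(w, e)]
                scores.append(s)
            z = sum(scores) or 1.0
            for e, s in zip(cand, scores):
                weight = s / z
                totals[e] += weight * len(words)
                for w in words:
                    counts[(w, e)] += weight
        # M-step: re-estimate smoothed P(word | event).
        for (w, e) in prob:
            prob[(w, e)] = (counts[(w, e)] + alpha) / (totals[e] + alpha * len(vocab))

    # Decode: pick the best candidate event for each comment.
    return [max(cand, key=lambda e: sum(prob[(w, e)] for w in words))
            for words, cand in zip(comments, candidates)]
```

Unambiguous comments (one candidate event) anchor the model, and those word statistics then disambiguate the genuinely ambiguous comments — the bootstrapping effect the abstract describes.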

Cited by 338 publications (445 citation statements)
References 14 publications
“…A lot of attention has been paid to the individual content selection and selective realization sub-problems (Barzilay and Lee, 2004; Barzilay and Lapata, 2005; Liang et al., 2009). Recent works (Chen and Mooney, 2008; Chen et al., 2010; Mei et al., 2016) explore full selective generation and learn alignments between generated texts and input data using a translation model.…”
Section: Related Work
confidence: 99%
“…The RW corpus we studied is from the sports domain, which has attracted great interest (Chen and Mooney, 2008; Mei et al., 2016; Puduppully et al., 2019b). However, unlike generating one-entity descriptions (Lebret et al., 2016; or having the output strictly bounded by the inputs (Novikova et al., 2017), this corpus poses additional challenges since the targets contain ungrounded content.…”
Section: Related Work
confidence: 99%
“…"Noisy" triples, such as dbr:Sequoyah dbo:occupation dbr:Sequoyah and dbr:Acie Law dbo:termPeriod Acie Law 1-10, are very common in the DBpedia triples allocated to the Wikipedia biographies. Since their information is not verbalised in the text, our systems learn to disregard them, explaining the lower scores that these models achieve with respect to the number of summarised triples.…”
Section: Model
confidence: 99%
“…The difficulty is that data available in knowledge bases needs to be aligned with the corresponding texts. Existing solutions for data-to-text generation either focus mainly on creating a small, domain-specific corpus where data and text are manually aligned by a small group of experts, such as the WeatherGov [9] and RoboCup [10] datasets, or rely heavily on crowdsourcing [11,12], which makes them costly to apply to large domains. We rely on the alignment of DBpedia and Wikidata with Wikipedia in order to create two corpora of knowledge base triples from DBpedia and Wikidata, and their corresponding textual summaries.…”
Section: Introduction
confidence: 99%