Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2019)
DOI: 10.18653/v1/n19-1399

Cross-referencing Using Fine-grained Topic Modeling

Abstract: Cross-referencing, which links passages of text to other related passages, can be a valuable study aid for facilitating comprehension of a text. However, cross-referencing requires, first, a comprehensive thematic knowledge of the entire corpus and, second, a focused search through the corpus specifically to find such useful connections. Due to this, cross-reference resources are prohibitively expensive and exist only for the most well-studied texts (e.g. religious texts). We develop a topic-based system for aut…
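As a rough illustration of the topic-based approach the abstract describes, passages can be linked by comparing their topic distributions and proposing the nearest neighbors as cross-references. This is a minimal sketch with a hypothetical similarity measure (Jensen-Shannon divergence) and toy data; the paper's actual model and ranking method may differ:

```python
import math

def js_divergence(p, q):
    """Jensen-Shannon divergence between two topic distributions
    (symmetric, lower means more topically similar)."""
    m = [(a + b) / 2 for a, b in zip(p, q)]
    def kl(x, y):
        return sum(a * math.log(a / b) for a, b in zip(x, y) if a > 0)
    return (kl(p, m) + kl(q, m)) / 2

def cross_references(topic_dists, top_k=1):
    """For each passage, return the indices of the top_k most
    topically similar other passages."""
    links = {}
    for i, p in enumerate(topic_dists):
        scored = sorted(
            (js_divergence(p, q), j)
            for j, q in enumerate(topic_dists) if j != i
        )
        links[i] = [j for _, j in scored[:top_k]]
    return links

# Toy topic distributions for four passages over three topics.
dists = [
    [0.8, 0.1, 0.1],
    [0.7, 0.2, 0.1],   # dominated by the same topic as passage 0
    [0.1, 0.8, 0.1],
    [0.1, 0.1, 0.8],
]
print(cross_references(dists))  # passages 0 and 1 link to each other
```

In practice the distributions would come from a topic model trained on the corpus rather than being hand-specified.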

Cited by 3 publications (3 citation statements); references 8 publications.
“…In this study, we discussed some results and emerging trends and how they can be understandable from the perspective of earlier studies, including our comparisons. The difference between the two methods using bible text as corpora; the results give some indication about how evenly the distribution of words is between the documents (24) . The analysis shows that both techniques find the most significant percentage of instances and assessments of the context in which the words appear that contain words related to God's creation and his mandate for humanity.…”
Section: Discussion
confidence: 99%
“…These experiments relate to a large body of work that considers how preprocessing methods affect the downstream accuracy of various algorithms, ranging from topics in information retrieval (Chaudhari et al, 2015;Patil and Atique, 2013;Beil et al, 2002), text classification and regression (Forman, 2003;Yang and Pedersen, 1997;Vijayarani et al, 2015;Kumar and Harish, 2018;HaCohen-Kerner et al, 2020;Symeonidis et al, 2018;Weller et al, 2020), topic modeling (Blei et al, 2003;Lund et al, 2019;Schofield and Mimno, 2016;Schofield et al, 2017a,b), and even more complex tasks like question answering (Jijkoun et al, 2003;Carvalho et al, 2007) and machine translation (Habash, 2007;Habash and Sadat, 2006;Leusch et al, 2005;Weller et al, 2021;Mehta et al, 2020) to name a few. With the rise of noisy social media, text preprocessing has become important for tasks that use data from sources like Twitter and Reddit (Symeonidis et al, 2018;Singh and Kumari, 2016;Bao et al, 2014;Jianqiang, 2015;Weller and Seppi, 2020;Zirikly et al, 2019;Babanejad et al, 2020).…”
Section: Related Work
confidence: 99%
“…This hypothesis is also supported by the low correlation between vocabulary size and accuracy, indicating that what is in the vocabulary is more important than its size. These experiments relate to a large body of work that considers how preprocessing methods affect the downstream accuracy of various algorithms, ranging from topics in information retrieval (Chaudhari et al, 2015;Patil and Atique, 2013;Beil et al, 2002), text classification and regression (Forman, 2003;Yang and Pedersen, 1997;Vijayarani et al, 2015;Kumar and Harish, 2018;HaCohen-Kerner et al, 2020;Symeonidis et al, 2018;Weller et al, 2020), topic modeling (Blei et al, 2003;Lund et al, 2019;Schofield and Mimno, 2016;Schofield et al, 2017a,b), and even more complex tasks like question answering (Jijkoun et al, 2003;Carvalho et al, 2007) and machine translation (Habash, 2007;Habash and Sadat, 2006;Leusch et al, 2005;Weller et al, 2021;Mehta et al, 2020) 2020) analyze and cross-compare up to 16 different techniques for four machine learning algorithms. In contrast, our work is the first to examine these preprocessing techniques beyond accuracy, examining them in tandem with how they affect vocabulary size and run-time.…”
Section: Combination Techniques
confidence: 99%
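The last citation statement stresses that preprocessing choices matter not just for accuracy but for vocabulary size and run-time. A minimal sketch of measuring that effect, using a hypothetical stopword list and toy corpus (not drawn from the cited papers):

```python
import string

# Hypothetical, deliberately tiny stopword list for illustration.
STOPWORDS = {"the", "and", "of", "in", "to", "a", "is"}

def tokenize(text, lowercase=True, strip_punct=True, remove_stopwords=True):
    """Tokenize with toggleable preprocessing steps, so the effect of
    each choice on vocabulary size can be measured independently."""
    if lowercase:
        text = text.lower()
    if strip_punct:
        text = text.translate(str.maketrans("", "", string.punctuation))
    tokens = text.split()
    if remove_stopwords:
        tokens = [t for t in tokens if t.lower() not in STOPWORDS]
    return tokens

corpus = [
    "In the beginning God created the heavens and the earth.",
    "The earth was without form, and void.",
]

# Vocabulary with no preprocessing vs. with all steps enabled.
raw_vocab = {t for doc in corpus for t in tokenize(doc, False, False, False)}
clean_vocab = {t for doc in corpus for t in tokenize(doc)}
print(len(raw_vocab), len(clean_vocab))  # preprocessing shrinks the vocabulary
```

Sweeping each flag independently over a real corpus, and timing the downstream model under each setting, gives the kind of vocabulary-size and run-time comparison the citing work describes.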