2022 IEEE International Conference on Big Data (Big Data) 2022
DOI: 10.1109/bigdata55660.2022.10020935

Applications of data analysis on scholarly long documents

Cited by 3 publications (4 citation statements)
References 17 publications
“…An entire document might be divided into sections that must include Introduction and Conclusion for all academic disciplines and valid structure. Statistical-based NLP techniques or trained models are not fit for academic research documents since they contain domain-specific jargon, and writing style may vary widely [37]. In addition, the machine learning approach for working with text in documents requires a corpus of labeled data that is difficult and expensive for widespread forms of document structure or content [2,37,38].…”
Section: Information Extraction From Large Documents
Citation type: mentioning
confidence: 99%
“…Statistical-based NLP techniques or trained models are not fit for academic research documents since they contain domain-specific jargon, and writing style may vary widely [37]. In addition, the machine learning approach for working with text in documents requires a corpus of labeled data that is difficult and expensive for widespread forms of document structure or content [2,37,38]. The technique of information extraction from large documents is often preprocessed by building the document's representation from its constituent units, such as sentences or other symbolic tokens, which can be aggregated [38].…”
Section: Information Extraction From Large Documents
Citation type: mentioning
confidence: 99%
“…BERT [20] and SciBERT [8] are some popular language models that uses transformers to perform various sequence tasks including classification. To obtain more tailored results, we can fine-tune base models [7]. We use language models for our chapter classification work.…”
Section: Services Background
Citation type: mentioning
confidence: 99%