Learning Rich Representation of Keyphrases from Text

Kulkarni, Mayank; Mahata, Debanjan; Arora, Ravneet; Bhowmik, Rajarshi

doi:10.48550/arxiv.2112.08547

Cited by 4 publications

(6 citation statements)

References 0 publications


“…Recently, a novel pre-trained model, known as KeyBART, has been introduced for acquiring rich keyphrase representations, resulting in enhanced keyphrase generation performance by harnessing the robust capabilities of the BART architecture [4]. Experimental findings demonstrate that KeyBART surpasses state-of-the-art methods in terms of performance for the keyphrase generation task.…”

Section: Keyphrase Generation (mentioning)

confidence: 99%

“…, C_k, the keyphrase generator generates keyphrases for each cluster C_i. In this work, we use KeyBART [4] as the keyphrase generator. KeyBART is a task-specific language model that learns rich representation of keyphrases from text documents by using different masking strategies for pre-training transformer language models.…”

Section: Keyphrase Generator Module (mentioning)

confidence: 99%

“…They have been widely used for various text generation tasks, such as text summarization and keyphrase generation. Among them, KeyBART [4] is a generative pretrained language model that learns rich keyphrase representation and can improve many downstream NLP tasks, especially the keyphrase generation task.…”

Section: Introduction (mentioning)

confidence: 99%

“…We adopt the same evaluation metric as used in the original KeyBART paper [4] to select the best configuration on the validation set and for evaluation on the test set. The chosen metric is the F-score@M, which is a harmonic mean of precision and recall, considering the top-M generated keywords.…”

mentioning

confidence: 99%
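
The F-score@M described in the statement above can be sketched as follows. This is a minimal illustration, not the evaluation code of either paper: the function name `f1_at_m`, the lowercase and whitespace normalization, and the optional `m` cutoff are assumptions made here. Published keyphrase evaluations often also stem phrases before matching, which this sketch leaves out.

```python
from typing import Optional, Sequence


def _normalize(phrase: str) -> str:
    """Lowercase and collapse whitespace (an assumed, simplified normalization)."""
    return " ".join(phrase.lower().split())


def f1_at_m(predicted: Sequence[str], gold: Sequence[str],
            m: Optional[int] = None) -> float:
    """Harmonic mean of precision and recall over the generated keyphrases.

    With m=None every prediction is scored; otherwise only the first m
    (deduplicated) predictions are considered.
    """
    seen = set()
    preds = []
    for phrase in predicted:
        norm = _normalize(phrase)
        # Keep the first occurrence of each phrase, preserving rank order.
        if norm and norm not in seen:
            seen.add(norm)
            preds.append(norm)
    if m is not None:
        preds = preds[:m]

    gold_set = {_normalize(g) for g in gold if _normalize(g)}
    if not preds or not gold_set:
        return 0.0

    hits = sum(1 for p in preds if p in gold_set)
    if hits == 0:
        return 0.0
    precision = hits / len(preds)
    recall = hits / len(gold_set)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical example data, not taken from any dataset.
    predicted = ["keyphrase generation", "BART", "Transformers"]
    gold = ["BART", "keyphrase generation", "pre-training", "masking"]
    print(round(f1_at_m(predicted, gold), 4))
```

Exact matching after normalization is the strictest choice; a variant that credits partial or stemmed matches would yield higher scores on the same predictions.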


TrendFlow: A Machine Learning Framework for Research Trend Analysis

et al. 2023


As various research fields continue to evolve, new technologies emerge constantly, making it challenging for scholars to keep up with the latest and most promising research directions. To address this issue, we propose TrendFlow, a framework that leverages machine learning and deep learning techniques for analyzing research trends. TrendFlow first searches relevant literature based on user-defined queries, then clusters the searched literature according to the abstracts, and finally generates keyphrases of the abstracts as research trends for each cluster. Our experimental results highlight the superior performance of TrendFlow compared to traditional literature analysis tools. We have released the beta version of TrendFlow on Huggingface.


“…The classification-based approach involves extracting keywords from a document by evaluating every token in the document to determine if it is a keyword or not [21][22][23]. In contrast, the generation-based approach uses a generative language model to abstractively generate keywords for an input document [24,25]. According to numerous studies [25], generative language models like BART [26] outperform classification-based methods in extraction accuracy, making the generation-based approach more commonly adopted for keyword extraction.…”

Section: Introduction (mentioning)

confidence: 99%

RoBERTa-Based Keyword Extraction from Small Number of Korean Documents

Kim, Lee, Park et al. 2023

Electronics


Keyword extraction is the task of identifying essential words in a lengthy document. This process is primarily executed through supervised keyword extraction. In instances where the dataset is limited in size, a classification-based approach is typically employed. Therefore, this paper introduces a novel keyword extractor based on a classification approach. The proposed keyword extractor comprises three key components: RoBERTa, a keyword estimator, and a decision rule. RoBERTa encodes an input document, the keyword estimator calculates the probability of each token in the document being a keyword, and the decision rule ultimately determines whether each token is a keyword based on these probabilities. However, training the proposed model with a small dataset presents two challenges. One is that none of the tokens in a document may be judged a keyword, and the other is that a single word can be composed of both keyword tokens and non-keyword tokens. Two novel heuristics are proposed to address these problems. These heuristics have been extensively tested through experiments, demonstrating that the proposed keyword extractor surpasses both the generation-based approach and the vanilla RoBERTa in environments with limited data. The efficacy of the heuristics is further validated through an ablation study. In summary, the proposed heuristics have proven to be effective in developing a supervised keyword extractor with a small dataset.


Capturing the Concept Projection in Metaphorical Memes for Downstream Learning Tasks

Acharya, Das, Sudarshan 2024

IEEE Access


Metaphorical memes, where a source concept is projected into a target concept, are an essential construct in figurative language. In this article, we present a novel approach for downstream learning tasks on metaphorical multimodal memes. Our proposed framework replaces traditional methods using metaphor annotations with a metaphor-capturing mechanism. Besides using the significant zero-shot learning capability of state-of-the-art pretrained encoders, this work introduces an alternative external knowledge enhancement strategy based on ChatGPT (chatbot generative pretrained transformer), demonstrating its effectiveness in bridging the intermodal semantic gap. We propose a new concept projection process consisting of three distinct components to capture the intramodal knowledge and intermodal concept gap in the forms of text modality embedding, visual modality embedding, and concept projection embedding. This approach leverages the attention mechanism of the Graph Attention Network for fusing the common aspects of external knowledge related to the knowledge in the text and image modality to implement the concept projection process. Our experimental results demonstrate the superiority of our proposed approach compared to existing methods.
