Information Retrieval (IR) constitutes a vital facet of Open-Domain Question Answering (ODQA) systems, focusing on the exploration of pertinent information within extensive collections of passages, such as Wikipedia, to facilitate subsequent reader processing. Historically, information retrieval relied on textual overlap for relevant context retrieval, employing methods like BM25 and TF-IDF, which, however, lacked natural language understanding. The advent of deep learning ushered in a new era, leading to the introduction of Dense Passage Retrievers (DPR), which show superiority over traditional sparse retrievers. These dense retrievers leverage Pre-trained Language Models (PLMs) to initialize context encoders, enabling the extraction of natural language representations, and they use the distance between latent vectors of contexts as a metric for assessing similarity. However, DPR methods rely heavily on large volumes of meticulously labeled data, such as Natural Questions, and data labeling is both costly and time-intensive. In this paper, we propose a novel data augmentation methodology, SDA (Self Data Augmentation), that employs DPR models to automatically annotate unanswered questions. Specifically, we begin by retrieving relevant pseudo passages for these unlabeled questions. We then introduce three distinct passage selection methods to annotate these pseudo passages. Finally, we combine the pseudo-labeled passages with the unanswered questions to create augmented data. Our experimental evaluations, conducted on two extensive datasets (Natural Questions and TriviaQA) and a relatively small dataset (WebQuestions) with three diverse base models, illustrate the significant enhancement achieved through the incorporation of freshly augmented data. Moreover, our proposed data augmentation method is remarkably flexible and readily adaptable to various dense retrievers.
Additionally, we have conducted a comprehensive human study on the augmented data, which further supports our conclusions.
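The retrieve-then-select pseudo-labeling step described above can be sketched as follows. This is a minimal illustration, assuming question and passage embeddings have already been produced by the dense retriever's encoders; the function name `pseudo_label` and the parameters `top_k` and `threshold` are illustrative, and a simple score threshold stands in for the paper's three passage selection methods, which the abstract does not detail.

```python
import numpy as np

def pseudo_label(question_vecs, passage_vecs, top_k=2, threshold=0.0):
    """For each unlabeled question, retrieve the top-k passages by inner
    product (the similarity metric DPR-style retrievers use) and keep
    those scoring above a confidence threshold as pseudo-positive
    (question, passage, score) triples for the augmented training set."""
    scores = question_vecs @ passage_vecs.T            # (Q, P) similarity matrix
    augmented = []
    for qi, row in enumerate(scores):
        top = np.argsort(row)[::-1][:top_k]            # best-scoring passage ids
        augmented.extend(
            (qi, int(pi), float(row[pi])) for pi in top if row[pi] > threshold
        )
    return augmented

# Toy usage with hand-made 2-d embeddings for two questions and three passages.
questions = np.array([[1.0, 0.0], [0.0, 1.0]])
passages = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
pairs = pseudo_label(questions, passages, top_k=2, threshold=0.0)
```

The resulting triples would then be merged with the gold-labeled data before retraining the retriever.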