Interspeech 2020
DOI: 10.21437/interspeech.2020-2153

An Effective Domain Adaptive Post-Training Method for BERT in Response Selection

Cited by 72 publications (89 citation statements) | References: 0 publications

“…Post-training refers to the process of performing additional unsupervised training to a PLM such as BERT using unlabeled domain-specific data, prior to fine-tuning. It has been shown that this leads to improved performance by helping the PLM to adapt to the target domain [46][47][48][49]. We start with a monolingual PLM in L_S and completely adapt it to L_T.…”
Section: Transfer Learning As Post-training
confidence: 99%
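
The post-training step described above is, in essence, continued masked-language-model training on unlabeled in-domain text before task fine-tuning. The sketch below shows one way to do this with the Hugging Face Transformers Trainer; the corpus file name, checkpoint, and hyperparameters are illustrative placeholders, not the setup used in the paper.

```python
# Minimal sketch of domain-adaptive post-training: continue MLM training of a
# pretrained BERT on unlabeled, domain-specific text, then fine-tune the
# resulting checkpoint on the downstream task (e.g. response selection).
from datasets import load_dataset
from transformers import (BertForMaskedLM, BertTokenizerFast,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")

# "domain_corpus.txt" is a hypothetical file with one utterance per line.
corpus = load_dataset("text", data_files={"train": "domain_corpus.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=128)

train_set = corpus["train"].map(tokenize, batched=True, remove_columns=["text"])

# Randomly mask 15% of tokens, as in standard BERT pre-training.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer,
                                           mlm_probability=0.15)

args = TrainingArguments(output_dir="bert-post-trained",
                         num_train_epochs=3,
                         per_device_train_batch_size=16)

Trainer(model=model, args=args, train_dataset=train_set,
        data_collator=collator).train()
# The checkpoint in "bert-post-trained" is then fine-tuned on labeled data.
```
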
“…Similar to BERT, we use special tokens like [CLS] to denote the beginning of the sequence, and [SEP] to separate the two modalities. Moreover, to inject the multi-turn dialog structure into the model, we utilize a special token [EOT] to denote end of turn (Whang et al, 2019), which informs the model when the dialog turn ends. As such, we prepare the input sequence into the format as…”
Section: Vision-dialog Transformer Encoder
confidence: 99%
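
As a rough illustration of the input format this statement describes, the snippet below joins dialog turns with an [EOT] marker and lets the tokenizer wrap the context/response pair with [CLS] and [SEP]. The example turns and the maximum length are made up for illustration and are not taken from the cited papers.

```python
# Sketch of preparing a multi-turn context/response pair with an [EOT]
# token marking the end of each dialog turn (illustrative example only).
from transformers import BertTokenizerFast

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
# Register [EOT] as an additional special token so it is never split.
tokenizer.add_special_tokens({"additional_special_tokens": ["[EOT]"]})

context_turns = ["hi, how can i help?",
                 "my laptop will not boot",
                 "did you try safe mode?"]
response = "yes, it hangs on the logo screen"

# Join turns with [EOT]; the tokenizer adds [CLS] ... [SEP] ... [SEP].
context = " [EOT] ".join(context_turns) + " [EOT]"
encoded = tokenizer(context, response, truncation=True, max_length=128)
print(tokenizer.convert_ids_to_tokens(encoded["input_ids"]))
# -> ['[CLS]', 'hi', ..., '[EOT]', ..., '[SEP]', 'yes', ..., '[SEP]']
```

When the enlarged vocabulary is used with a model, the embedding matrix must be resized to match, e.g. model.resize_token_embeddings(len(tokenizer)).
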
“…Baseline Models. Here we mainly introduce the state-of-the-art baseline: BERT-DPT (Whang et al, 2019), which fine-tunes BERT by optimizing the domain post-training (DPT) loss comprising both NSP and MLM objectives for response selection. Details of other baselines can be found in Appendix.…”
Section: Experiments IV: Evaluation On New Task
confidence: 99%
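
The DPT loss mentioned here combines the masked language model and next-sentence prediction objectives. Below is a minimal sketch of computing such a joint loss with Hugging Face's standard BERT pre-training head; the example pair and the toy single-token masking are placeholders, not the paper's training procedure.

```python
# Sketch of a joint MLM + NSP loss on one context/response pair,
# using the standard BERT pre-training head (illustrative only).
import torch
from transformers import BertForPreTraining, BertTokenizerFast

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
model = BertForPreTraining.from_pretrained("bert-base-uncased")

context = "my laptop will not boot"
response = "did you try safe mode?"
enc = tokenizer(context, response, return_tensors="pt")

# MLM labels: mask a single token position here for brevity; real
# post-training masks roughly 15% of tokens at random.
labels = torch.full_like(enc["input_ids"], -100)   # -100 = ignored by the loss
masked_pos = 3
labels[0, masked_pos] = enc["input_ids"][0, masked_pos]
enc["input_ids"][0, masked_pos] = tokenizer.mask_token_id

# NSP label: 0 means the response actually follows the context.
out = model(**enc, labels=labels, next_sentence_label=torch.tensor([0]))
print(out.loss)  # sum of the MLM loss and the NSP loss
```
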