Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining 2023
DOI: 10.1145/3580305.3599931
WebGLM: Towards An Efficient Web-Enhanced Question Answering System with Human Preferences

Abstract: We present WebGLM, a web-enhanced question-answering system based on the General Language Model (GLM). Its goal is to augment a pre-trained large language model (LLM) with web search and retrieval capabilities while being efficient for real-world deployments. To achieve this, we develop WebGLM with strategies for the LLM-augmented retriever, bootstrapped generator, and human preference-aware scorer. Specifically, we identify and address the limitations of WebGPT (OpenAI), through which WebGLM is enabled with a…
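The abstract names three components: an LLM-augmented retriever, a bootstrapped generator, and a human preference-aware scorer. A minimal sketch of how such a retrieve-generate-score loop could be wired together is given below; every class, function, and interface here is an illustrative assumption, not the actual WebGLM API.

```python
# Hypothetical sketch of a web-enhanced QA flow (retrieve -> generate -> score).
# Names and interfaces are assumptions for illustration, not WebGLM's real code.
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Reference:
    url: str
    snippet: str


def answer_question(
    question: str,
    search: Callable[[str], List[Reference]],               # LLM-augmented retriever (assumed interface)
    generate: Callable[[str, List[Reference]], List[str]],   # bootstrapped generator producing candidate answers
    score: Callable[[str, str], float],                      # human preference-aware scorer / reward model
    top_k: int = 5,
) -> str:
    """Return the candidate answer preferred by the scorer."""
    references = search(question)[:top_k]          # 1) fetch and rank web references
    candidates = generate(question, references)    # 2) draft cited answers conditioned on references
    return max(candidates, key=lambda ans: score(question, ans))  # 3) keep the preference-best answer


if __name__ == "__main__":
    # Toy stand-ins so the sketch runs end to end without a real LLM or search engine.
    refs = [Reference("https://example.org", "GLM is a general language model.")]
    best = answer_question(
        "What is WebGLM built on?",
        search=lambda q: refs,
        generate=lambda q, r: [f"WebGLM is based on GLM [1]. ({r[0].url})"],
        score=lambda q, a: len(a),  # placeholder preference score
    )
    print(best)
```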

Cited by 17 publications (4 citation statements). References 23 publications.
“…The pre-trained base model is fine-tuned, and the resulting RedGPT model (Yang et al, 2023b) is further used for instruction generation in an iterative manner to obtain a massive amount of high-quality data. WebGLM-QA (Liu et al, 2023e) generates data in three stages: Prompt Formulation, Instruction Inducting, and Few-shot In-context Learning. Wizard evol instruct 196K (Xu et al, 2023b) and Wizard evol instruct 70K (Xu et al, 2023b) use the Evol-Instruct method, subjecting 175 seed instructions to four evolution stages to enhance the complexity of generated instructions.…”
Section: Model Constructed Datasets
Mentioning confidence: 99%
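The three-stage bootstrapping mentioned in the statement above (Prompt Formulation, Instruction Inducting, Few-shot In-context Learning) can be pictured roughly as in the sketch below. The prompt layout and the `complete` callable are assumptions for illustration, not the paper's actual templates.

```python
# Hedged sketch of few-shot in-context bootstrapping of QA training samples.
# The template and helper names are hypothetical, not the WebGLM-QA pipeline itself.
from typing import Callable, List, Tuple

Exemplar = Tuple[str, str, str]  # (question, references, cited answer)


def formulate_prompt(instruction: str, exemplars: List[Exemplar],
                     question: str, references: str) -> str:
    """Assemble an induced instruction plus a few in-context exemplars."""
    blocks = [instruction]
    for q, refs, ans in exemplars:
        blocks.append(f"References:\n{refs}\nQuestion: {q}\nAnswer: {ans}")
    blocks.append(f"References:\n{references}\nQuestion: {question}\nAnswer:")
    return "\n\n".join(blocks)


def bootstrap_sample(complete: Callable[[str], str], instruction: str,
                     exemplars: List[Exemplar], question: str, references: str) -> dict:
    """Produce one (question, references, answer) training sample with an LLM call."""
    prompt = formulate_prompt(instruction, exemplars, question, references)
    return {"question": question, "references": references, "answer": complete(prompt)}


if __name__ == "__main__":
    sample = bootstrap_sample(
        complete=lambda p: "GLM is a general language model [1].",  # stand-in for an LLM call
        instruction="Answer the question using the numbered references and cite them like [1].",
        exemplars=[("What is GLM?", "[1] GLM paper abstract ...", "GLM is a pretraining framework [1].")],
        question="What does WebGLM add to GLM?",
        references="[1] WebGLM adds web search and retrieval ...",
    )
    print(sample["answer"])
```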
“…In the end, approximately 240K instructions are obtained. • WebGLM-QA (Liu et al, 2023e). The WebGLM-QA dataset is designed for training the WebGLM generation module and comprises approximately 43K high-quality samples.…”
Section: B12 Model Constructed Datasets
Mentioning confidence: 99%
“…WebGLM (Liu et al, 2023b) augments LLMs with web search and retrieval capabilities. One major limitation of these approaches is the retrieved text is question-related, thus cannot guarantee the correctness of the question-unrelated portions in the generations.…”
Section: Related Work
Mentioning confidence: 99%
“…ML tasks performed well on the SQuAD dataset, released less than a year ago. A logistic regression (LR) model based on linguistic features by Liu et al (2023) in June 2016 achieved an F1-score of 51%, up from 20%. The author reaches 77.3% F1 using BiDAF encoding, a bidirectional LSTM, and multistage decoding ( Chen et al, 2017 ).…”
Section: Literature Survey
Mentioning confidence: 99%