2020
DOI: 10.1007/978-3-030-61377-8_29

Deep Learning Models for Representing Out-of-Vocabulary Words

Cited by 13 publications (7 citation statements)
References 19 publications
“…A number of challenges and limitations have been focused on, including biased data, overreliance on surface-level patterns, limited common sense, poor ability to reason and interpret feedback [506], [507]. Other issues include: the need for vast amounts of data and computational resources [508], limited generalizability [509], lack of interpretability [510], difficulty with rare or out-of-vocabulary words, limited understanding of syntax and grammar [511], and limited domain-specific knowledge [512].…”
Section: Challenges and Limitations of Large Language Models (mentioning)
confidence: 99%
“…The difference between BPE and WordPiece comes mainly from how the subwords are assigned, where BPE chooses the most frequent byte pair and WordPiece chooses the pair which maximises the likelihood of the training data. Other models try to learn to predict the meaning of an unknown word based on surrounding words, individual characters, or a combination of both (Lochter et al., 2020). Implementing an OOV solution which allows transfer learning of a pre-trained deep learning NLP encoder could potentiate more semantically accurate representations of technical language word embeddings, which in turn would improve the potential for TLS.…”
Section: Challenges and Solutions (mentioning)
confidence: 99%
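The merge-selection difference described in the statement above is small enough to show directly. The sketch below is illustrative only (a toy character-level corpus, not code from the cited paper): BPE merges the adjacent symbol pair with the highest raw frequency, while WordPiece scores pairs with a likelihood-style ratio, count(a,b) / (count(a) x count(b)), and merges the maximum.

```python
# Minimal sketch (toy corpus, not from the cited paper) contrasting how
# BPE and WordPiece choose the next subword merge.
from collections import Counter

# Toy corpus: words split into characters, mapped to occurrence counts.
corpus = {("l", "o", "w"): 5, ("l", "o", "w", "e", "r"): 2,
          ("n", "e", "w", "e", "s", "t"): 6, ("w", "i", "d", "e", "s", "t"): 3}

def pair_and_symbol_counts(corpus):
    pair_counts, symbol_counts = Counter(), Counter()
    for word, freq in corpus.items():
        for sym in word:
            symbol_counts[sym] += freq
        for a, b in zip(word, word[1:]):
            pair_counts[(a, b)] += freq
    return pair_counts, symbol_counts

pair_counts, symbol_counts = pair_and_symbol_counts(corpus)

# BPE: merge the most frequent adjacent pair.
bpe_choice = max(pair_counts, key=pair_counts.get)

# WordPiece: merge the pair that most increases training-data likelihood,
# commonly approximated as count(a,b) / (count(a) * count(b)).
wordpiece_choice = max(
    pair_counts,
    key=lambda p: pair_counts[p] / (symbol_counts[p[0]] * symbol_counts[p[1]]),
)

print("BPE would merge:", bpe_choice)              # highest raw pair frequency
print("WordPiece would merge:", wordpiece_choice)  # highest likelihood ratio
```

On this toy corpus the two criteria already disagree: BPE favours a very frequent pair such as ("e", "s"), while WordPiece favours a pair whose symbols rarely occur apart, such as ("i", "d").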
“…A common method to deal with OOV words, used in for instance BERT [114] and GPT [147,148,149], is to input subwords and byte-pair encodings rather than the words themselves to the model. Other models try to learn to predict the meaning of an unknown word based on surrounding words, individual characters, or a combination of both [150]. Implementing an OOV solution which allows transfer learning of a pre-trained deep learning NLP encoder could potentiate more semantically accurate representations of technical language word embeddings, which in turn would improve the potential for TLS.…”
Section: Technical Language Processing (mentioning)
confidence: 99%
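To make the subword strategy mentioned above concrete, here is a small self-contained sketch of WordPiece-style greedy longest-match segmentation. The vocabulary and the function name are hypothetical, not BERT's real vocabulary or tokenizer; GPT-family models use BPE, which learns its vocabulary differently but serves the same purpose of keeping rare words representable.

```python
# Minimal sketch (assumed toy vocabulary) of how a WordPiece-style tokenizer
# decomposes an out-of-vocabulary word into known subword pieces.
vocab = {"un", "##believ", "##ably", "believ", "able", "[UNK]"}

def wordpiece_tokenize(word, vocab):
    # Greedy longest-match-first segmentation; continuation pieces use "##".
    pieces, start = [], 0
    while start < len(word):
        end, cur = len(word), None
        while start < end:
            piece = word[start:end]
            if start > 0:
                piece = "##" + piece
            if piece in vocab:
                cur = piece
                break
            end -= 1
        if cur is None:
            return ["[UNK]"]  # no known piece covers this span
        pieces.append(cur)
        start = end
    return pieces

# "unbelievably" need not appear in the vocabulary to get a useful encoding.
print(wordpiece_tokenize("unbelievably", vocab))
# -> ['un', '##believ', '##ably']
```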