Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers) 2018
DOI: 10.18653/v1/p18-2001

Continuous Learning in a Hierarchical Multiscale Neural Network

Abstract: We reformulate the problem of encoding a multi-scale representation of a sequence in a language model by casting it in a continuous learning framework. We propose a hierarchical multi-scale language model in which short time-scale dependencies are encoded in the hidden state of a lower-level recurrent neural network, while longer time-scale dependencies are encoded in the dynamic of the lower-level network by having a meta-learner update the weights of the lower-level neural network in an online meta-learning fashion…

Cited by 28 publications (44 citation statements) · References 30 publications
“…We fine-tuned BERT on our training dataset, which consisted of sentences (concepts and feature combinations) and corresponding true and false labels. We did this using the Hugging Face BertForSequenceClassification transformers package for PyTorch (Wolf et al., 2019). The trained BERT model output a 2-dimensional vector corresponding to activation in favor of true and activation in favor of false.…”
Section: Model Specification (mentioning)
confidence: 99%
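As a concrete illustration of the fine-tuning setup described in this statement, here is a minimal sketch using the transformers BertForSequenceClassification API with PyTorch; the example sentences, labels, and hyperparameters are placeholders, not taken from the cited study.

```python
# Minimal sketch of fine-tuning BERT for binary true/false classification.
# Example data and hyperparameters are illustrative only.
import torch
from torch.utils.data import DataLoader, TensorDataset
from transformers import BertTokenizer, BertForSequenceClassification

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

sentences = ["a robin has wings", "a robin has gills"]   # hypothetical concept-feature sentences
labels = torch.tensor([1, 0])                            # 1 = true, 0 = false

enc = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
loader = DataLoader(TensorDataset(enc["input_ids"], enc["attention_mask"], labels), batch_size=2)

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()
for input_ids, attention_mask, y in loader:
    out = model(input_ids=input_ids, attention_mask=attention_mask, labels=y)
    out.loss.backward()          # cross-entropy over the 2-dimensional output
    optimizer.step()
    optimizer.zero_grad()

# At inference time, out.logits is the 2-dimensional vector
# (activation in favor of false vs. in favor of true).
```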
“…For the BERT-based classification model, we use the Simple-Representations PyPI library, which enables its users to extract text representations from the pre-trained models and feed them directly as features to a Keras-built neural network. We extract representations from the BERT model available on the huggingface.co website (Wolf et al., 2019). Specifically, we use the “bert-base-multilingual-uncased” version for both the English and Spanish languages.…”
Section: Methods (mentioning)
confidence: 99%
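The statement names the Simple-Representations PyPI package, whose exact API is not shown here, so the sketch below reproduces the same idea directly with the transformers and Keras APIs, using the cited "bert-base-multilingual-uncased" checkpoint. The pooling choice, layer sizes, and inputs are assumptions for illustration.

```python
# Sketch: extract fixed-length BERT representations and feed them to a Keras classifier.
import numpy as np
import tensorflow as tf
from transformers import AutoTokenizer, TFAutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-uncased")
encoder = TFAutoModel.from_pretrained("bert-base-multilingual-uncased")

def embed(texts):
    # Use the [CLS] token's last hidden state as a fixed-length text representation.
    enc = tokenizer(texts, padding=True, truncation=True, return_tensors="tf")
    return encoder(**enc).last_hidden_state[:, 0, :].numpy()

X = embed(["an example sentence", "una frase de ejemplo"])   # hypothetical inputs
y = np.array([1, 0])

# Feed the frozen representations as features to a small Keras network.
clf = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation="relu", input_shape=(X.shape[1],)),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
clf.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
clf.fit(X, y, epochs=3, verbose=0)
```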
“…It is therefore challenging to draw general conclusions on recurrent networks in CL from this kind of experiment. Examples of applications are online learning of language models where new words are added incrementally [71,113,63], continual learning in neural machine translation on multiple languages [102], and sentiment analysis on multiple domains [77].…”
Section: Survey Of Continual Learning In Recurrent Models (mentioning)
confidence: 99%
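One recurring ingredient of the online language-modeling setting mentioned above is growing the vocabulary as new words arrive. The sketch below is a hypothetical PyTorch illustration of that single step, not an implementation from any of the cited works.

```python
# Illustrative sketch: grow an embedding table when new words appear in the stream,
# preserving the rows learned so far.
import torch
import torch.nn as nn

class GrowableEmbedding(nn.Module):
    def __init__(self, vocab_size, dim):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, dim)

    def add_words(self, n_new):
        old = self.emb
        new = nn.Embedding(old.num_embeddings + n_new, old.embedding_dim)
        with torch.no_grad():
            new.weight[: old.num_embeddings] = old.weight   # keep previously learned vectors
        self.emb = new

    def forward(self, idx):
        return self.emb(idx)

vocab = {"<unk>": 0, "the": 1}
layer = GrowableEmbedding(len(vocab), dim=16)
vocab["meta-learner"] = len(vocab)      # a new word arrives online
layer.add_words(1)
print(layer(torch.tensor([vocab["meta-learner"]])).shape)   # torch.Size([1, 16])
```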
“…Table 3: Datasets used in continual learning for sequential data processing.

Dataset | Application | Scenario
Stroke MNIST [99,33] | stroke classification | SIT+(NI/NC)
Quick, Draw! † | stroke classification | SIT+NC
MNIST-like [26] [25] † | object classification | SIT+(NI/NC)
CORe50 [88] | object recognition | SIT+(NI/NC)
MNLI [10] | domain adaptation | SIT+NI
MDSD [77] | sentiment analysis | SIT+NI
WMT17 [14] | NMT | MT+NC
OpenSubtitles18 [73] | NMT | MT+NC
WIPO COPPA-V2 [60] [102] | NMT | MT+NC
CALM [63] | language modeling | Online
WikiText-2 [113] | language modeling | SIT+NI/NC
Audioset [26,33] | sound classification | SIT+NC
LibriSpeech, Switchboard [114] | speech recognition | (SIT/MT)+NC
Synthetic Speech Commands † | sound classification | SIT+NC
Acrobot [62] | reinforcement learning | MT+NI

The scenario column indicates in which scenario the dataset has been used (or could be used when the related paper does not specify this information).…”
Section: Dataset Application Scenario (mentioning)
confidence: 99%
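For readers unfamiliar with the scenario codes in the table above, SIT/MT denote single-incremental-task vs. multi-task settings, and NI/NC denote new instances vs. new classes. The sketch below shows one hypothetical way to build an SIT+NC stream by splitting a dataset into class-incremental experiences; the use of MNIST and the group size are assumptions for illustration only.

```python
# Hypothetical sketch of an SIT+NC (new-classes) continual learning stream:
# each experience introduces a group of classes not seen before.
import torch
from torch.utils.data import Subset
from torchvision import datasets, transforms

mnist = datasets.MNIST(root="data", train=True, download=True,
                       transform=transforms.ToTensor())

def new_classes_split(dataset, classes_per_experience=2):
    """Yield Subsets, each containing only the next group of unseen classes."""
    targets = dataset.targets.tolist()
    all_classes = sorted(set(targets))
    for i in range(0, len(all_classes), classes_per_experience):
        group = set(all_classes[i : i + classes_per_experience])
        idx = [j for j, t in enumerate(targets) if t in group]
        yield Subset(dataset, idx)

for step, experience in enumerate(new_classes_split(mnist)):
    print(f"experience {step}: {len(experience)} examples")   # train sequentially here
```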