Findings of the Association for Computational Linguistics: ACL 2022
DOI: 10.18653/v1/2022.findings-acl.52

Human Language Modeling

Abstract: Natural language is generated by people, yet traditional language modeling views words or documents as if generated independently. Here, we propose human language modeling (HuLM), a hierarchical extension to the language modeling problem whereby a human-level exists to connect sequences of documents (e.g. social media messages) and capture the notion that human language is moderated by changing human states. We introduce HaRT, a large-scale transformer model for the HuLM task, pre-trained on approximately 100,…
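A rough formulation of the task the abstract describes (illustrative notation; the paper's exact formulation may differ): each token of a user's i-th document is predicted from both its in-document context and a latent human state carried over from the user's earlier documents,

$$ P(w_{i,t} \mid w_{i,<t},\, U_{i-1}), \qquad U_i = f(U_{i-1}, d_i), $$

where d_i is the user's i-th document and U_i the human state after processing it; standard language modeling is the special case in which U is dropped.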

Cited by 4 publications (8 citation statements); references 32 publications.
“…It is possible to analyze at the sentence level but still include the context of the surrounding sentences and the specific participant. We tested a Human-aware Recurrent Transformer LLM (HaRT; Soni et al., 2022), a model with a modified GPT-based architecture that enables processing language in the context of each participant. This model learns a recurrently updating latent participant representation (user state) derived from the language of each participant.…”
Section: Sentence Embeddings
confidence: 99%
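A minimal sketch of the recurrently updating user state the statement describes (names, shapes, and the gated-update rule are illustrative assumptions, not HaRT's actual implementation):

```python
import torch
import torch.nn as nn

class RecurrentUserState(nn.Module):
    """Toy recurrent update of a latent participant representation.

    After each message is encoded, the user state is blended with a
    summary of that message, so later messages can be processed in the
    context of everything the participant has written so far.
    """

    def __init__(self, hidden_size: int):
        super().__init__()
        self.gate = nn.Linear(2 * hidden_size, hidden_size)
        self.candidate = nn.Linear(2 * hidden_size, hidden_size)

    def forward(self, user_state: torch.Tensor, message_hidden: torch.Tensor) -> torch.Tensor:
        # message_hidden: (seq_len, hidden) -> mean-pool to one summary vector.
        summary = message_hidden.mean(dim=0)
        joined = torch.cat([user_state, summary], dim=-1)
        g = torch.sigmoid(self.gate(joined))    # how much to update
        c = torch.tanh(self.candidate(joined))  # proposed new state
        return g * c + (1 - g) * user_state
```

Processing a participant is then a loop: encode message t (conditioning the encoder on the current user state), update the state, and move on to message t+1.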
“…The typical focus in NLP is to model language itself using huge amounts of data and to employ these language models to solve language tasks (e.g., GLUE tasks). However, human-level AI models the individual behind the language (54, 55). Modeling a person behind a text may include assessing their depression or suicide risk (54).…”
Section: Leveraging Big Data Information For Small Samples
confidence: 99%
“…Recent works (Lynn et al., 2020; Matero et al., 2021b; Soni et al., 2022) have highlighted the importance of incorporating author context into message representations through the use of history and multi-level modeling. We use the Human-aware Recurrent Transformer model (Soni et al., 2022), which is built on GPT-2 (Radford et al., 2019), to produce message representations that also encode the latent representation of the author. We adapted HaRT in two ways.…”
Section: Task A
confidence: 99%
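The adapted-HaRT setup itself is not reproduced here, but the general pattern of embedding a message in the context of its author's history can be sketched with a plain GPT-2 (the model choice, mean-pooling, and history-concatenation shortcut are illustrative assumptions; HaRT instead threads a recurrent user state between messages):

```python
import torch
from transformers import GPT2Model, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2Model.from_pretrained("gpt2").eval()

def message_representation(history: list[str], message: str) -> torch.Tensor:
    """Embed `message` in the context of the author's prior messages."""
    sep = tokenizer.eos_token
    text = sep.join(history + [message])
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=1024)
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state  # (1, seq_len, hidden)
    # Mean-pool the final layer as a simple message-level representation.
    return hidden.mean(dim=1).squeeze(0)
```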
“…Here, we explore two types of modeling techniques that can capture changes over time: Human-aware Recurrent Transformers (Soni et al., 2022) and difference embeddings. These techniques were used as part of the WWBP-SQT-lite system for the CLPsych 2022 shared tasks (Tsakalidis et al., 2022a): (Task A) modeling user state changes over time (Tsakalidis et al., 2022b) and (Task B) predicting the suicide risk associated with the user (Shing et al., 2018). Our contributions are as follows: (a) evaluation of Human-aware Recurrent Transformers (HaRT) and difference embeddings for Task A, (b) exploring SoTA methods for predicting state escalations and switches, and (c) exploring theoretically related linguistic assessments.…”
Section: Introduction
confidence: 99%
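Difference embeddings are the simpler of the two techniques: they represent the change between consecutive posts rather than the posts themselves. A minimal sketch (the shared-task system's exact construction may differ):

```python
import numpy as np

def difference_embeddings(post_embeddings: np.ndarray) -> np.ndarray:
    """post_embeddings: (n_posts, dim), ordered by time.

    Returns one vector per post capturing the shift from the previous
    post; the first post has no predecessor, so its row is zeros.
    """
    diffs = np.diff(post_embeddings, axis=0)         # (n_posts - 1, dim)
    first = np.zeros((1, post_embeddings.shape[1]))  # convention for post 0
    return np.vstack([first, diffs])
```

These feed naturally into Task A, where the target is precisely a change in user state from one post to the next.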