Tianqing Fang scite author profile

Tianqing Fang

4 Publications

40 Citation Statements Received

85 Citation Statements Given

Affiliations

University of Hong Kong, Hong Kong University of Science and Technology

Publications

Probing Toxic Content in Large Pre-Trained Language Models

Ousidhoum, Zhao, Fang et al. 2021

Large pre-trained language models (PTLMs) have been shown to carry biases towards different social groups, which leads to the reproduction of stereotypical and toxic content by major NLP systems. We propose a method based on logistic regression classifiers to probe English, French, and Arabic PTLMs and quantify the potentially harmful content that they convey with respect to a set of templates. The templates are prompted by the name of a social group followed by a cause-effect relation. We use PTLMs to predict masked tokens at the end of a sentence in order to examine how likely they are to enable toxicity towards specific communities. We shed light on how such negative content can be triggered within unrelated and benign contexts, based on evidence from a large-scale study, and then explain how our methodology can be used to assess and mitigate the toxicity transmitted by PTLMs.

ASER: Towards large-scale commonsense knowledge acquisition via higher-order selectional preference over eventualities

Zhang, Liu, Pan et al. 2022

Artificial Intelligence

DISCOS: Bridging the Gap between Discourse Knowledge and Commonsense Knowledge

Fang, Zhang, Wang et al. 2021

Weakly Supervised Text Classification using Supervision Signals from a Language Model

Zeng, Ni, Fang et al. 2022

Solving text classification in a weakly supervised manner is important for real-world applications where human annotations are scarce. In this paper, we propose to query a masked language model with cloze-style prompts to obtain supervision signals. We design a prompt which combines the document itself and "this article is talking about [MASK]." A masked language model can generate words for the [MASK] token. The generated words, which summarize the content of a document, can be utilized as supervision signals. We propose a latent variable model that simultaneously learns a word distribution learner, which associates generated words with pre-defined categories, and a document classifier, without using any annotated data. Evaluation on three datasets, AG-News, 20Newsgroups, and UCINews, shows that our method can outperform baselines by 2%, 4%, and 3%, respectively.
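The cloze-style prompt described in the abstract above can be illustrated with a minimal sketch. The abstract only says the prompt "combines" the document with the quoted sentence, so placing the sentence after the document is an assumption here, as are the function name `build_cloze_prompt`, the whitespace normalization, and the sample text. This is not the authors' code.

```python
# Minimal sketch: build a cloze-style prompt for a masked language model.
# Assumption: the cloze sentence is appended AFTER the document text;
# the abstract does not state the order.

MASK_TOKEN = "[MASK]"  # BERT-style mask token, as written in the abstract


def build_cloze_prompt(document: str, mask_token: str = MASK_TOKEN) -> str:
    """Return the document followed by the cloze sentence from the abstract."""
    # Collapse runs of whitespace so the prompt is a single clean line.
    text = " ".join(document.split())
    return f"{text} this article is talking about {mask_token}."


if __name__ == "__main__":
    # Hypothetical sample document, for illustration only.
    doc = "The central bank raised interest rates\nby a quarter point on Tuesday."
    print(build_cloze_prompt(doc))
```

The resulting string would then be passed to a masked language model, whose predictions for the mask position serve as the supervision signal. The mask token differs between tokenizers (some use `<mask>` rather than `[MASK]`), which is why it is a parameter.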
