Building open-domain chatbots is a challenging area for machine learning research. While prior work has shown that scaling neural models in the number of parameters and the size of the data they are trained on gives improved results, we highlight other ingredients. Good conversation requires blended skills: providing engaging talking points, and displaying knowledge, empathy, and personality appropriately, while maintaining a consistent persona. We show that large-scale models can learn these skills when given appropriate training data and choice of generation strategy. We build variants of these recipes with 90M, 2.7B, and 9.4B parameter models, and make our models and code publicly available. Human evaluations show our best models outperform existing approaches in multi-turn dialogue on engagingness and humanness measurements. We then discuss the limitations of this work by analyzing failure cases of our models.
Per- and polyfluorinated alkyl substances (PFASs) enter Arctic lakes through long-range atmospheric transport and local contamination, but their behavior in aquatic food webs at high latitudes is poorly understood. This study compared the concentrations of perfluorocarboxylates, perfluorosulfonates, and fluorotelomer sulfonates (FTS) in biotic and abiotic samples from six high Arctic lakes near Resolute Bay, Nunavut, Canada. Two of these lakes are known to be locally contaminated by a small airport, and Arctic char (Salvelinus alpinus) from these lakes had over 100 times higher total [PFAS] than fish from neighboring lakes. Perfluorononanoate (PFNA) and perfluorooctanesulfonate (PFOS) dominated in char, benthic chironomids (their main prey), and sediments, while pelagic zooplankton and water were dominated by lower chain acids and perfluorodecanesulfonate (PFDS). This study also provides the first measures of perfluoroethylcyclohexanesulfonate (PFECHS) and FTS compounds in water, sediment, juvenile char, and benthic invertebrates from lakes in the high Arctic. Negative relationships between [PFAS] and δ¹⁵N values (indicative of trophic position) within these food webs indicated no biomagnification. Overall, these results suggest that habitat use and local sources of contamination, but not trophic level, are important determinants of [PFAS] in biota from freshwater food webs in the Canadian Arctic.
The biomagnification behavior of perfluorinated carboxylates (PFCAs) and perfluorinated sulfonates (PFSAs) was studied in terrestrial food webs consisting of lichen and plants, caribou, and wolves from two remote northern areas in Canada. Six PFCAs with eight to thirteen carbons and perfluorooctane sulfonate (PFOS) were regularly detected in all species. The lowest concentrations were found in vegetation (0.02-0.26 ng/g wet weight (ww) sum (Σ) PFCAs and 0.002-0.038 ng/g ww PFOS). Wolf liver showed the highest concentrations (10-18 ng/g ww ΣPFCAs and 1.4-1.7 ng/g ww PFOS), followed by caribou liver (6-10 ng/g ww ΣPFCAs and 0.7-2.2 ng/g ww PFOS). Biomagnification factors were highly tissue and substance specific. Therefore, individual whole-body concentrations were calculated and used for biomagnification and trophic magnification assessment. Trophic magnification factors (TMFs) were highest for PFCAs with nine to eleven carbons (TMF = 2.2-2.9) as well as PFOS (TMF = 2.3-2.6), and all but perfluorooctanoate were significantly biomagnified. The relationship of PFCA and PFSA TMFs with chain length in the terrestrial food chain was similar to that reported in previous studies of Arctic marine mammal food webs, but the absolute values of the TMFs were around two times lower in this study than in the marine environment. This study demonstrates that challenges remain for applying the TMF approach to studies of biomagnification of PFCAs and PFSAs, especially for terrestrial animals.
Being engaging, knowledgeable, and empathetic are all desirable general qualities in a conversational agent. Previous work has introduced tasks and datasets that aim to help agents learn those qualities in isolation and gauge how well they can express them. But rather than being specialized in one single quality, a good open-domain conversational agent should be able to seamlessly blend them all into one cohesive conversational flow. In this work, we investigate several ways to combine models trained towards isolated capabilities, ranging from simple model aggregation schemes that require minimal additional training, to various forms of multi-task training that encompass several skills at all training stages. We further propose a new dataset, Blended-SkillTalk, to analyze how these capabilities would mesh together in a natural conversation, and compare the performance of different architectures and training schemes. Our experiments show that multi-tasking over several tasks that focus on particular capabilities results in better blended conversation performance compared to models trained on a single skill, and that both unified and two-stage approaches perform well if they are constructed to avoid unwanted bias in skill selection or are fine-tuned on our new task.
We introduce VoxPopuli, a large-scale multilingual corpus providing 400K hours of unlabeled speech data in 23 languages. It is the largest open dataset to date for unsupervised representation learning as well as semi-supervised learning. VoxPopuli also contains 1.8K hours of transcribed speeches in 15 languages and their aligned oral interpretations into 15 target languages, totaling 17.3K hours. We provide speech recognition (ASR) baselines and validate the versatility of VoxPopuli unlabeled data in semi-supervised ASR and speech-to-text translation under challenging out-of-domain settings.