AI is undergoing a paradigm shift with the rise of models (e.g., BERT, DALL-E, GPT-3) that are trained on broad data at scale and are adaptable to a wide range of downstream tasks. We call these models foundation models to underscore their critically central yet incomplete character. This report provides a thorough account of the opportunities and risks of foundation models, ranging from their capabilities (e.g., language, vision, robotics, reasoning, human interaction) and technical principles (e.g., model architectures, training procedures, data, systems, security, evaluation, theory) to their applications (e.g., law, healthcare, education) and societal impact (e.g., inequity, misuse, economic and environmental impact, legal and ethical considerations). Though foundation models are based on standard deep learning and transfer learning, their scale results in new emergent capabilities, and their effectiveness across so many tasks incentivizes homogenization. Homogenization provides powerful leverage but demands caution, as the defects of the foundation model are inherited by all the adapted models downstream. Despite the impending widespread deployment of foundation models, we currently lack a clear understanding of how they work, when they fail, and what they are even capable of due to their emergent properties. To tackle these questions, we believe much of the critical research on foundation models will require deep interdisciplinary collaboration commensurate with their fundamentally sociotechnical nature.
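To make the idea of "adapting a foundation model to a downstream task" concrete, here is a minimal sketch of the standard fine-tuning recipe the abstract alludes to, assuming the HuggingFace transformers and PyTorch libraries; the task, labels, and example texts are illustrative placeholders, not part of the report itself.

```python
# Minimal sketch: adapting a pretrained foundation model (BERT) to a downstream
# classification task by fine-tuning. Texts and labels are hypothetical.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2  # e.g., a 2-way sentiment task
)

texts = ["The movie was wonderful.", "A disappointing, tedious film."]
labels = torch.tensor([1, 0])  # hypothetical downstream labels

batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

model.train()
for _ in range(3):  # a few gradient steps; a real run would iterate over a dataset
    outputs = model(**batch, labels=labels)
    outputs.loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```

The same pretrained checkpoint can be re-adapted this way to many tasks, which is exactly the homogenization the abstract describes: any defect in the shared base model propagates to every adapted model.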
We study the settings for which deep contextual embeddings (e.g., BERT) give large improvements in performance relative to classic pretrained embeddings (e.g., GloVe), and an even simpler baseline (random word embeddings), focusing on the impact of the training set size and the linguistic properties of the task. Surprisingly, we find that both of these simpler baselines can match contextual embeddings on industry-scale data, and often perform within 5 to 10% accuracy (absolute) on benchmark tasks. Furthermore, we identify properties of data for which contextual embeddings give particularly large gains: language containing complex structure, ambiguous word usage, and words unseen in training.
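The "random word embeddings" baseline mentioned above can be illustrated with a short sketch: each word gets a fixed random vector, sentences are average-pooled, and only a linear classifier is trained on top. The vocabulary, data, and dimensions below are hypothetical, chosen only to show the setup; this is not the paper's experimental code.

```python
# Sketch of a frozen random-embedding baseline for text classification.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
vocab = {"the": 0, "movie": 1, "was": 2, "great": 3, "terrible": 4}
emb = rng.standard_normal((len(vocab), 50))  # frozen random embeddings, d = 50

def featurize(sentence):
    ids = [vocab[w] for w in sentence.split() if w in vocab]
    return emb[ids].mean(axis=0) if ids else np.zeros(emb.shape[1])

X = np.stack([featurize("the movie was great"),
              featurize("the movie was terrible")])
y = np.array([1, 0])

clf = LogisticRegression().fit(X, y)  # embeddings stay fixed; only the classifier learns
print(clf.predict(X))
```

Swapping the random matrix for pretrained GloVe vectors, or for contextual BERT features, gives the comparison the paper studies.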
A challenge for named entity disambiguation (NED), the task of mapping textual mentions to entities in a knowledge base, is how to disambiguate entities that appear rarely in the training data, termed tail entities. Humans use subtle reasoning patterns based on knowledge of entity facts, relations, and types to disambiguate unfamiliar entities. Inspired by these patterns, we introduce Bootleg, a self-supervised NED system that is explicitly grounded in reasoning patterns for disambiguation. We define core reasoning patterns for disambiguation, create a learning procedure to encourage the self-supervised model to learn the patterns, and show how to use weak supervision to enhance the signals in the training data. Encoding the reasoning patterns in a simple Transformer architecture, Bootleg meets or exceeds state-of-the-art on three NED benchmarks. We further show that the learned representations from Bootleg successfully transfer to other non-disambiguation tasks that require entity-based knowledge: we set a new state-of-the-art on the popular TACRED relation extraction task by 1.0 F1 points and demonstrate up to 8% performance lift in highly optimized production search and assistant tasks at a major technology company.
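For readers unfamiliar with NED, the core step is ranking a mention's candidate entities and selecting the best-scoring one. The sketch below illustrates that step only; it is not Bootleg's implementation, and the mention, candidates, and embeddings are hypothetical (Bootleg additionally encodes type and relation signals in a Transformer).

```python
# Illustrative candidate ranking for named entity disambiguation (not Bootleg's code).
import numpy as np

rng = np.random.default_rng(1)
mention_vec = rng.standard_normal(64)  # stand-in for a contextual encoding of "Lincoln"
candidates = ["Abraham_Lincoln", "Lincoln,_Nebraska", "Lincoln_Motor_Company"]
entity_vecs = {c: rng.standard_normal(64) for c in candidates}  # stand-in entity embeddings

scores = {c: float(mention_vec @ v) for c, v in entity_vecs.items()}
predicted = max(scores, key=scores.get)
print(predicted, scores[predicted])
```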