Mixture of Experts layers (MoEs) enable efficient scaling of language models through conditional computation. This paper presents a detailed empirical study of how autoregressive MoE language models scale in comparison with dense models in a wide range of settings: in- and out-of-domain language modeling, zero- and few-shot priming, and full fine-tuning. With the exception of fine-tuning, we find MoEs to be substantially more compute efficient. At more modest training budgets, MoEs can match the performance of dense models using ∼4 times less compute. This gap narrows at scale, but our largest MoE model (1.1T parameters) consistently outperforms a compute-equivalent dense model (6.7B parameters). Overall, this performance gap varies greatly across tasks and domains, suggesting that MoE and dense models generalize differently in ways that are worthy of future study. We make our code and models publicly available for research use.
* Equal contribution. Authors listed alphabetically.
Aim
The current literature lacks a definitive threshold of idiopathic premature ventricular complex (PVC) burden for predicting cardiomyopathy (CMP). The main objective of the present study was to evaluate the relationship between PVC burden and left ventricular ejection fraction (LVEF).
Method
This multicenter, cross‐sectional study included 341 consecutive patients with more than 1,000 idiopathic PVCs in 24 hr of Holter monitoring who were admitted to cardiology clinics at nineteen different centers between January 2019 and May 2019. The primary outcome was the LVEF measured during echocardiographic examination.
Result
Overall, the median age was 50 years (38–60) and 139 (49.4%) patients were female. The median PVC burden was 9% (IQR: 4%–17.4%). The median LVEF was 60% (55–65). We used proportional odds logistic regression to examine the relationship between continuous LVEF and candidate predictors. Increases in PVC burden (%) (regression coefficient (RE) −0.644, 95% CI −1.063, −0.225, p < .001), PVC QRS duration (RE −0.191, 95% CI −0.529, 0.148, p = .049), and age (RE −0.249, 95% CI −0.442, −0.056, p = .018) were associated with a decrease in LVEF. This inverse relationship between PVC burden and LVEF became more prominent when PVC burden was above 5%. A nomogram was developed to estimate the individual risk of a decrease in LVEF.
Conclusion
Our study showed that increases in PVC burden (%), age, and PVC QRS duration were independently associated with a decrease in LVEF in patients with idiopathic PVCs. In addition, the inverse relationship between PVC burden and LVEF was observed at a lower PVC burden than previously known.