Distributed deep learning training usually adopts All-Reduce as the synchronization mechanism for data-parallel algorithms because of its high performance in homogeneous environments. However, its performance is bounded by the slowest worker and degrades significantly in heterogeneous settings. AD-PSGD, a recently proposed synchronization method that offers fast numerical convergence and heterogeneity tolerance, suffers from deadlock issues and high synchronization overhead. Is it possible to get the best of both worlds: a distributed training method that matches the performance of All-Reduce in homogeneous environments while retaining the heterogeneity tolerance of AD-PSGD? In this paper, we propose Ripples, a high-performance heterogeneity-aware asynchronous decentralized training approach. We achieve this goal through intensive synchronization optimization, emphasizing the interplay between the algorithm and the system implementation. To reduce synchronization cost, we propose a novel communication primitive, Partial All-Reduce, that allows a large group of workers to synchronize quickly. To reduce synchronization conflicts, we propose static group scheduling for homogeneous environments and simple techniques (Group Buffer and Group Division) that avoid conflicts at the cost of slightly reduced randomness. Our experiments show that in a homogeneous environment, Ripples is 1.1× faster than the state-of-the-art implementation of All-Reduce, 5.1× faster than Parameter Server, and 4.3× faster than AD-PSGD. In a heterogeneous setting, Ripples shows a 2× speedup over All-Reduce and still obtains a 3× speedup over the Parameter Server baseline.
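To make the Partial All-Reduce idea concrete, the following is a minimal sketch, not the authors' implementation: instead of averaging model replicas across all workers, a scheduled group of workers averages among themselves while the rest continue training. The function name `partial_all_reduce`, the static two-group schedule, and the toy model sizes are illustrative assumptions.

```python
import numpy as np

def partial_all_reduce(models, group):
    """Average the model replicas of the workers in `group` only."""
    group_avg = np.mean([models[w] for w in group], axis=0)
    for w in group:
        models[w] = group_avg.copy()
    return models

# Toy example: 6 workers, each holding a 4-parameter model replica.
rng = np.random.default_rng(0)
models = [rng.normal(size=4) for _ in range(6)]

# Static group schedule for a homogeneous setting: disjoint groups can
# synchronize concurrently without conflicting on any shared worker.
schedule = [(0, 1, 2), (3, 4, 5)]
for group in schedule:
    models = partial_all_reduce(models, group)

print(models[0], models[3])  # workers 0-2 share one average, workers 3-5 another
```

Disjoint groups avoid synchronization conflicts by construction, which is the property the static schedule above is meant to illustrate.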
Transfer learning (TL) leverages information from sources outside the domain at hand to enhance model performance. Popular TL methods either use out-of-domain data directly or adapt models learned on out-of-domain resources and incorporate them within in-domain models. TL methods have shown promise in several applications such as text classification, cross-domain language classification, and emotion recognition. In this paper, we apply TL methods to computational modeling of human behavioral traits. Many behavioral traits are abstract constructs (e.g., the sincerity of an individual) and are often conceptually related to other constructs (e.g., level of deception), making TL methods an attractive option for their modeling. We consider the problem of automatically predicting human sincerity and deception from behavioral data while transferring knowledge between the two tasks. We compare our methods against baseline models trained only on in-domain data. Our best models achieve an Unweighted Average Recall (UAR) of 72.02% in classifying deception (baseline: 69.64%). Similarly, the applied methods achieve Spearman's/Pearson's correlation values of 49.37%/48.52% between true and predicted sincerity scores (baseline: 46.51%/41.58%), indicating the success and the potential of TL for such human behavior tasks.
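As a rough illustration of the model-adaptation style of TL mentioned above, here is a minimal, hypothetical sketch: a model trained on an out-of-domain construct (deception) contributes a feature to the in-domain model (sincerity regression). The synthetic data, feature construction, and estimator choices are illustrative assumptions, not the paper's actual pipeline.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression, Ridge

rng = np.random.default_rng(0)
X_deception = rng.normal(size=(200, 10))           # out-of-domain behavioral features
y_deception = (X_deception[:, 0] > 0).astype(int)  # toy deception labels

X_sincerity = rng.normal(size=(100, 10))           # in-domain behavioral features
y_sincerity = -X_sincerity[:, 0] + 0.1 * rng.normal(size=100)  # toy sincerity scores

# 1) Learn the out-of-domain (deception) model.
deception_clf = LogisticRegression().fit(X_deception, y_deception)

# 2) Transfer: append its predicted deception probability as an extra feature
#    for the in-domain sincerity regressor.
transfer_feat = deception_clf.predict_proba(X_sincerity)[:, 1:]
X_augmented = np.hstack([X_sincerity, transfer_feat])

sincerity_model = Ridge().fit(X_augmented, y_sincerity)
```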
Recent work has shown that decentralized algorithms can deliver superior performance over centralized ones in the context of machine learning. The two approaches, whose main difference lies in their distinct communication patterns, are both susceptible to performance degradation in heterogeneous environments. Although considerable effort has been devoted to making centralized algorithms robust to heterogeneity, little has been explored for decentralized algorithms regarding this problem. This paper proposes Hop, the first heterogeneity-aware decentralized training protocol. Based on a unique characteristic of decentralized training that we identify, the iteration gap, we propose a queue-based synchronization mechanism that can efficiently implement backup workers and bounded staleness in the decentralized setting. To cope with deterministic slowdown, we propose skipping iterations so that the effect of slower workers is further mitigated. We build a prototype implementation of Hop on TensorFlow. The experimental results on CNN and SVM workloads show significant speedup over standard decentralized training in heterogeneous settings.
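The following is a minimal sketch, assumed rather than taken from Hop's code, of the queue-based synchronization idea: each worker buffers updates from its neighbors in per-neighbor queues and advances as long as the iteration gap to every neighbor stays within a staleness bound. The class, method names, and the `BOUND` value are illustrative assumptions.

```python
from collections import deque

BOUND = 2  # maximum tolerated iteration gap (bounded staleness)

class Worker:
    def __init__(self, wid, neighbors):
        self.wid = wid
        self.iteration = 0
        self.queues = {n: deque() for n in neighbors}  # buffered neighbor updates
        self.consumed = {n: 0 for n in neighbors}      # neighbor progress seen so far

    def receive(self, neighbor, update):
        self.queues[neighbor].append(update)

    def try_step(self):
        # Consume whatever the neighbors have already sent.
        for n, q in self.queues.items():
            while q:
                q.popleft()
                self.consumed[n] += 1
        # Bounded staleness: block if any neighbor is more than BOUND behind.
        if any(self.iteration - seen > BOUND for seen in self.consumed.values()):
            return False
        self.iteration += 1  # local compute and model update would happen here
        return True

# Usage: worker 0 proceeds once its neighbors are within the staleness bound.
w = Worker(0, neighbors=[1, 2])
w.receive(1, "update from worker 1")
w.receive(2, "update from worker 2")
assert w.try_step()
```

Backup workers fit the same structure: a worker can proceed after consuming updates from only a quorum of neighbors instead of all of them.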