This paper presents VTN, a transformer-based framework for video recognition. Inspired by recent developments in vision transformers, we ditch the standard approach in video action recognition that relies on 3D ConvNets and introduce a method that classifies actions by attending to information from the entire video sequence. Our approach is generic and builds on top of any given 2D spatial network. In terms of wall-clock runtime, it trains 16.1× faster and runs 5.1× faster during inference while maintaining competitive accuracy compared to other state-of-the-art methods. It enables whole-video analysis, via a single end-to-end pass, while requiring 1.5× fewer GFLOPs. We report competitive results on Kinetics-400 and present an ablation study of VTN properties and the trade-off between accuracy and inference speed. We hope our approach will serve as a new baseline and start a fresh line of research in the video recognition domain. Code and models will be available soon.
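The core idea (a 2D per-frame backbone whose frame embeddings are aggregated by a temporal attention module) can be sketched roughly as below. This is a minimal illustration under assumptions, not the authors' implementation: a standard transformer encoder stands in for the paper's temporal attention module, positional embeddings are omitted, and all module names and hyperparameters are placeholders.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet50

class VTNSketch(nn.Module):
    """Minimal sketch of a VTN-style model: 2D backbone + temporal transformer."""
    def __init__(self, num_classes=400, dim=2048, depth=3, heads=8):
        super().__init__()
        backbone = resnet50(weights=None)
        # Drop the classification head; keep the pooled 2048-d feature per frame.
        self.backbone = nn.Sequential(*list(backbone.children())[:-1])
        # A [CLS] token aggregates the whole sequence, BERT-style.
        self.cls_token = nn.Parameter(torch.zeros(1, 1, dim))
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads, batch_first=True)
        self.temporal = nn.TransformerEncoder(layer, num_layers=depth)
        self.head = nn.Linear(dim, num_classes)

    def forward(self, video):            # video: (B, T, C, H, W)
        b, t = video.shape[:2]
        frames = video.flatten(0, 1)     # (B*T, C, H, W): one 2D pass per frame
        feats = self.backbone(frames).flatten(1).view(b, t, -1)  # (B, T, dim)
        cls = self.cls_token.expand(b, -1, -1)
        out = self.temporal(torch.cat([cls, feats], dim=1))
        return self.head(out[:, 0])      # classify from the [CLS] token

logits = VTNSketch()(torch.randn(2, 16, 3, 224, 224))  # -> shape (2, 400)
```

Because the temporal module only sees per-frame embeddings, the 2D backbone is interchangeable, which is what makes the approach generic.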
Purpose: Phenotype information is crucial for the interpretation of genomic variants. So far it has only been accessible to bioinformatics workflows after encoding into clinical terms by expert dysmorphologists.
Methods: Here, we introduce an approach driven by artificial intelligence that uses portrait photographs for the interpretation of clinical exome data. We measured the value added by computer-assisted image analysis to the diagnostic yield on a cohort of 679 individuals with 105 different monogenic disorders. For each case in the cohort we compiled frontal photos, clinical features, and the disease-causing variants, and simulated multiple exomes of different ethnic backgrounds.
Results: The additional use of similarity scores from computer-assisted analysis of frontal photos improved the top-1 accuracy rate for the disease-causing gene by more than 20%, to 89%, and the top-10 accuracy rate by more than 5%, to 99%.
Conclusion: Image analysis by deep-learning algorithms can be used to quantify phenotypic similarity (PP4 criterion of the American College of Medical Genetics and Genomics guidelines) and to advance the performance of bioinformatics pipelines for exome analysis.
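The gene-ranking step can be illustrated with a minimal sketch: an image-derived phenotypic (gestalt) similarity score per candidate gene is combined with a variant-level score from the exome pipeline, and genes are ranked by the combined value. The function names, the linear combination, and the weight `alpha` are illustrative assumptions, not the study's actual scoring model.

```python
def rank_genes(variant_scores, gestalt_scores, alpha=0.5):
    """Rank candidate genes by a weighted sum of a variant pathogenicity score
    and an image-derived phenotypic similarity score (alpha is hypothetical)."""
    combined = {
        gene: alpha * variant_scores.get(gene, 0.0)
              + (1 - alpha) * gestalt_scores.get(gene, 0.0)
        for gene in set(variant_scores) | set(gestalt_scores)
    }
    return sorted(combined, key=combined.get, reverse=True)

def top_k_accuracy(rankings, causal_genes, k):
    """Fraction of cases whose disease-causing gene appears in the top k."""
    hits = sum(causal in ranking[:k]
               for ranking, causal in zip(rankings, causal_genes))
    return hits / len(causal_genes)

# Toy example with made-up scores for one case:
ranking = rank_genes({"GENE_A": 0.9, "GENE_B": 0.8},
                     {"GENE_B": 0.95, "GENE_C": 0.4})
print(ranking)                                    # GENE_B ranked first
print(top_k_accuracy([ranking], ["GENE_B"], k=1)) # -> 1.0
```

The reported gain in top-1 and top-10 accuracy corresponds to the causal gene moving up such a ranking once the image-based score is added.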
High-throughput approaches are continuously progressing and have become a major part of clinical diagnostics. Still, the critical process of detailed phenotyping and gathering clinical information has not changed much in the last decades. Forms of next-generation phenotyping (NGP) are needed to further increase the value of any kind of genetic approach, including timely consideration of (molecular) cytogenetics during the diagnostic workup. As an NGP method, we used in this study the facial dysmorphology novel analysis (FDNA) technology to automatically identify facial phenotypes associated with Emanuel syndrome (ES) and Pallister-Killian syndrome (PKS) from 2D facial photos. The comparison between ES or PKS and normal individuals showed a full separation between the cohorts. Our results show that NGP is able to help in the clinic and could reduce the time patients spend in the diagnostic odyssey. It also helps to differentiate ES and PKS from each other and from other patients with small supernumerary marker chromosomes (sSMC), especially in countries with no access to more sophisticated genetic approaches apart from banding cytogenetics. Inclusion of more facial pictures of patients with sSMC, such as isochromosome 18p syndrome or cat-eye syndrome, may contribute to higher detection rates in the future.
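A "full separation" between affected and unaffected cohorts corresponds to a ROC AUC of 1.0 on the image-derived similarity scores. A minimal check of this kind might look as follows; the score values and cohort sizes are illustrative, not data from the study.

```python
from sklearn.metrics import roc_auc_score

# Hypothetical gestalt similarity scores from 2D photo analysis:
# higher means the face resembles the target syndrome more closely.
patient_scores = [0.91, 0.87, 0.95, 0.89]   # e.g. ES cohort
control_scores = [0.12, 0.20, 0.08, 0.15]   # unaffected individuals

labels = [1] * len(patient_scores) + [0] * len(control_scores)
scores = patient_scores + control_scores

# AUC of 1.0 means every patient scores above every control,
# i.e. the two cohorts are fully separated by the score.
print(roc_auc_score(labels, scores))  # -> 1.0
```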