Agnieszka Szmurło scite author profile

Agnieszka Szmurło

3 Publications

25 Citation Statements Received

19 Citation Statements Given

Affiliations

Institute of Computer Science, Warsaw University of Technology; Children's Memorial Health Institute

Publications

SeQuiLa: an elastic, fast and scalable SQL-oriented solution for processing and querying genomic intervals

Wiewiórka, Leśniewska, Szmurło et al. 2018

Summary: Efficient processing of large-scale genomic datasets has recently become possible due to the application of ‘big data’ technologies in bioinformatics pipelines. We present SeQuiLa, a distributed, ANSI SQL-compliant solution for speedy querying and processing of genomic intervals that is available as an Apache Spark package. The proposed range join strategy is significantly (∼22×) faster than the default Apache Spark implementation and outperforms other state-of-the-art tools for genomic interval processing. Availability and implementation: The project is available at http://biodatageeks.org/sequila/. Supplementary information: Supplementary data are available at Bioinformatics online.
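To illustrate what a range join over genomic intervals computes, here is a minimal single-machine sketch in Python. It is not SeQuiLa's implementation or its Spark strategy. The function name `range_join`, the `(chromosome, start, end)` tuple layout, and the closed-interval convention are all assumptions made for this example.

```python
from collections import defaultdict


def range_join(left, right):
    """Return all (l, r) pairs on the same chromosome whose closed intervals overlap.

    Each interval is a hypothetical (chromosome, start, end) tuple.
    """
    # Group the right-hand intervals by chromosome and sort each group by start.
    by_chrom = defaultdict(list)
    for r in right:
        by_chrom[r[0]].append(r)
    for rows in by_chrom.values():
        rows.sort(key=lambda r: r[1])

    pairs = []
    for l in left:
        chrom, l_start, l_end = l
        for r in by_chrom.get(chrom, []):
            _, r_start, r_end = r
            if r_start > l_end:
                break  # sorted by start, so no later interval can overlap
            if r_end >= l_start:
                pairs.append((l, r))
    return pairs


left = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 1, 10)]
right = [("chr1", 150, 250), ("chr1", 190, 510), ("chr1", 700, 800), ("chr2", 11, 20)]
result = range_join(left, right)
```

The overlap test used here corresponds to the SQL-style condition `l.start <= r.end AND r.start <= l.end` on rows with matching chromosomes.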

Comparison of kNN and k-means optimization methods of reference set selection for improved CNV callers performance

et al. 2019

Background: There are over 25 tools dedicated to the detection of Copy Number Variants (CNVs) using Whole Exome Sequencing (WES) data based on read depth analysis. The reported tools comprise several steps, including: (i) calculation of read depth for each sequencing target, (ii) normalization, (iii) segmentation and (iv) actual CNV calling. The essential aspect of the entire process is the normalization stage, in which systematic errors and biases are removed and the reference sample set is used to increase the signal-to-noise ratio. Although some CNV calling tools use dedicated algorithms to obtain the optimal reference sample set, most of the advanced CNV callers do not include this feature. To our knowledge, this work is the first attempt to assess the impact of reference sample set selection on CNV detection performance. Methods: We used WES data from the 1000 Genomes project to evaluate the impact of various methods of reference sample set selection on the CNV calling performance of three chosen state-of-the-art tools: CODEX, CNVkit and exomeCopy. Two naive solutions (all samples as the reference set and random selection) as well as two clustering methods (k-means and k nearest neighbours (kNN) with a variable number of clusters or group sizes) have been evaluated to discover the best performing sample selection method. Results and Conclusions: The performed experiments have shown that the appropriate selection of the reference sample set may greatly improve the CNV detection rate. In particular, we found that smart reduction of the reference sample size may significantly increase the algorithms’ precision while having a negligible negative effect on sensitivity. We observed that a complete CNV calling process with the k-means algorithm as the selection method has significantly better time complexity than the kNN-based solution.
Electronic supplementary material: The online version of this article (10.1186/s12859-019-2889-z) contains supplementary material, which is available to authorized users.
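As a rough illustration of kNN-based reference set selection, the Python sketch below picks the k samples whose read-depth profiles lie closest to a target sample. The abstract does not state the distance measure or data layout, so the Euclidean distance, the function name `knn_reference_set`, and the toy per-target depth vectors are assumptions for this example, not the authors' method.

```python
import math


def knn_reference_set(depths, target, k):
    """Pick the k samples whose read-depth profiles are closest to `target`.

    `depths` maps a sample name to a list of per-target read depths
    (hypothetical layout); Euclidean distance is an assumed metric.
    """
    t = depths[target]
    dist = {
        name: math.dist(t, profile)
        for name, profile in depths.items()
        if name != target  # a sample is never its own reference
    }
    # Sort sample names by distance and keep the k nearest.
    return sorted(dist, key=dist.get)[:k]


depths = {
    "S1": [10, 20, 30],
    "S2": [11, 19, 31],
    "S3": [50, 60, 70],
    "S4": [12, 22, 32],
}
reference = knn_reference_set(depths, "S1", 2)
```

In this toy data, sample `S3` has a very different depth profile from `S1`, so it is left out of the two-sample reference set.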

Case of pyoderma gangrenosum with visceral involvement: severely recurrent disease with lung, spleen, mesorectum and subcutaneous tissue involvement

Kędzierska, Szmurło, Szymańska et al. 2020
