FAME: fast and memory efficient multiple sequences alignment tool through compatible chain of roots

Naznooshsadat, Etminan; Parvinnia, Elham; Sharifi-Zarchi, Ali

doi:10.1093/bioinformatics/btaa175

Cited by 10 publications

(4 citation statements)

References 22 publications

“…To test FMAlign2’s performance on real datasets, we choose the long and similar datasets to serve as our benchmark. This dataset provided by Naznooshsadat et al (2020) includes five sequence sets of Variola (VARV), Mycoplasma genitalium (M.genitalium), Mycoplasma bovis (M.bovis), Streptococcus pneumoniae (S.pneumoniae), and Escherichia coli (E.coli). Each set contains an equal number of sequences but differs in average lengths, allowing us to assess the performance of the methods concerning the sequence length.…”

Section: Results (mentioning)

confidence: 99%

“…Unlike FAME (Naznooshsadat et al 2020) and FMAlign (Liu et al 2022), which use global chains, FMAlign2 segments sequences utilizing partial chains that appear in a subset of sequences. A global chain refers to a chain that exists in all sequences, with its substrings being completely identical across all sequences.…”

Section: Methods (mentioning)

confidence: 99%
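The distinction between global and partial support drawn in the statement above can be illustrated with a minimal sketch. It is not the FAME, FMAlign, or FMAlign2 implementation: the function names, the fixed k-mer length, and the "occurs exactly once per sequence" rule are all assumptions made for illustration, and the sketch only detects candidate anchors without ordering them into a chain.

```python
from collections import defaultdict

def kmer_positions(seq, k):
    """Map each k-mer to the list of start positions where it occurs in seq."""
    positions = defaultdict(list)
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)
    return positions

def shared_anchors(seqs, k, min_count=None):
    """Return k-mers that occur exactly once in at least min_count sequences.

    min_count defaults to len(seqs), i.e. "global" support (hypothetical
    parameter, not taken from any of the cited tools). A smaller value
    gives "partial" support from a subset of the sequences.
    """
    if min_count is None:
        min_count = len(seqs)
    support = defaultdict(dict)  # k-mer -> {sequence index: position}
    for idx, seq in enumerate(seqs):
        for kmer, pos in kmer_positions(seq, k).items():
            if len(pos) == 1:  # assumption: ignore k-mers repeated within a sequence
                support[kmer][idx] = pos[0]
    return {kmer: hits for kmer, hits in support.items() if len(hits) >= min_count}

seqs = ["ACGTTGCA", "ACGTAGCA", "ACGTAGGA"]
global_anchors = shared_anchors(seqs, 4)                # present in all three
partial_anchors = shared_anchors(seqs, 4, min_count=2)  # present in at least two
```

With these toy sequences only one 4-mer is shared by every sequence, while relaxing the requirement to two sequences admits additional anchors, which is the kind of extra segmentation opportunity the quoted statement attributes to partial chains.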

“…This strategy seeks common segments/minimizers to divide all the sequences and aligns every generated sub-sequence in parallel using the existing MSA method, enabling MSA methods to handle ultralong sequences more effectively. FAME (Naznooshsadat et al 2020), a novel vertical-division-based method for aligning similar sequences, has gained attention in recent years. This method utilizes hash tables to detect k-mers and minimizers, potentially incorporating nonoverlapping anchors into a single chain.…”

Section: Introduction (mentioning)

confidence: 99%
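As a rough sketch of the vertical-division idea in the statement above, the code below keeps a set of nonoverlapping anchors whose positions increase in every sequence and then cuts the sequences at those anchors. The greedy selection rule, the function names, and the representation of an anchor as a tuple of per-sequence start positions are assumptions for illustration only; they are not taken from FAME or any other cited tool.

```python
def compatible_chain(anchors, k):
    """Greedily keep anchors that do not overlap or cross the previous kept anchor.

    Each anchor is a tuple of start positions, one per sequence, for a shared
    substring of length k. Greedy selection is a simple heuristic (assumption),
    not an optimal chaining algorithm.
    """
    chain = []
    for pos in sorted(anchors):  # ordered by position in the first sequence
        if not chain or all(p >= q + k for p, q in zip(pos, chain[-1])):
            chain.append(pos)
    return chain

def split_at_chain(seqs, chain, k):
    """Cut every sequence at the chained anchors.

    Returns a list of blocks; each block holds one segment per sequence.
    Blocks alternate between inter-anchor regions (to be aligned by an
    existing MSA method, possibly in parallel) and the identical anchors.
    """
    blocks = []
    prev = [0] * len(seqs)
    for pos in chain:
        blocks.append([s[a:b] for s, a, b in zip(seqs, prev, pos)])  # region before anchor
        blocks.append([s[b:b + k] for s, b in zip(seqs, pos)])       # the anchor itself
        prev = [b + k for b in pos]
    blocks.append([s[a:] for s, a in zip(seqs, prev)])               # trailing region
    return blocks

seqs = ["TTACGTGGCCAA", "TACGTAGGCCA"]
# (4, 9) is a deliberately incompatible anchor that the chain should reject.
chain = compatible_chain([(2, 1), (4, 9), (6, 6)], k=4)
blocks = split_at_chain(seqs, chain, k=4)
```

Concatenating the segments of each sequence across all blocks reproduces the original sequence, so the sub-alignments can be joined back in order once each inter-anchor block has been aligned.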

FMAlign2: a novel fast multiple nucleotide sequence alignment method for ultralong datasets

Zhang, Liu, Wei et al. 2024. Bioinformatics

Motivation: In bioinformatics, multiple sequence alignment (MSA) is a crucial task. However, conventional methods often struggle with aligning ultralong sequences. To address this issue, researchers have designed MSA methods rooted in a vertical division strategy, which segments sequence data for parallel alignment. A prime example of this approach is FMAlign, which utilizes the FM-index to extract common seeds and segment the sequences accordingly.

Results: FMAlign2 leverages the suffix array to identify maximal exact matches, redefining the approach of FMAlign from searching for global chains to partial chains. By employing a vertical division strategy, large-scale problem is deconstructed into manageable tasks, enabling parallel execution of subMSA. Furthermore, sequence-profile alignment and refinement are incorporated to concatenate subsets, yielding the final result seamlessly. Compared to FMAlign, FMAlign2 markedly augments the segmentation of sequences and significantly reduces the time while maintaining accuracy, especially on ultralong datasets. Importantly, FMAlign2 enhances existing MSA methods by conferring the capability to handle sequences reaching billions in length within an acceptable time frame.

Availability: Source code and datasets are available at https://github.com/malabz/FMAlign2 and https://zenodo.org/records/10435770.

Contact: pingluzhang@outlook.com

Supplementary information: Supplementary data are available at Bioinformatics online.

“…For STP samples with more than one sequence per amplicon per sample, the sequence with the highest read count was used. The concatenated sequences were aligned using the long sequence aligner FAME [45]. Sites with homology greater than 90% and sites containing more than 50% gaps were removed.…”

Section: Principal Component Analysis (PCA) (mentioning)

confidence: 99%
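The site filter mentioned in the statement above could look roughly like the following sketch. The statement does not define "homology", so the code assumes it means the frequency of the most common non-gap character in a column; that interpretation, the function name, and the parameter names are all assumptions rather than the citing study's actual procedure.

```python
from collections import Counter

def filter_sites(alignment, max_identity=0.9, max_gap=0.5, gap="-"):
    """Drop alignment columns that are mostly gaps or nearly invariant.

    alignment is a list of equal-length aligned strings. A column is removed
    if its gap fraction exceeds max_gap, or if the most common non-gap
    character exceeds max_identity among the non-gap characters (assumed
    reading of "homology greater than 90%").
    """
    n = len(alignment)
    keep = []
    for col in range(len(alignment[0])):
        column = [row[col] for row in alignment]
        if column.count(gap) / n > max_gap:
            continue  # too many gaps
        residues = [c for c in column if c != gap]
        top_count = Counter(residues).most_common(1)[0][1]
        if top_count / len(residues) > max_identity:
            continue  # nearly invariant site, uninformative for PCA
        keep.append(col)
    filtered = ["".join(row[c] for c in keep) for row in alignment]
    return filtered, keep

alignment = ["ACG-", "ATG-", "AC--", "ATGA"]
filtered, kept_columns = filter_sites(alignment)
```

In this toy alignment the first and third columns are invariant among their non-gap characters and the last column is mostly gaps, so only the second column survives.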

Genetic surveillance reveals low, sustained malaria transmission with clonal replacement in Sao Tome and Principe

Chen, Ng, Garcia et al. 2024. Preprint

Despite efforts to eliminate malaria in Sao Tome and Principe (STP), cases have recently increased. Understanding residual transmission structure is crucial for developing effective elimination strategies. This study collected surveillance data and generated amplicon sequencing data from 980 samples between 2010 and 2016 to examine the genetic structure of the parasite population. The mean multiplicity of infection (MOI) was 1.3, with 11% polyclonal infections, indicating low transmission intensity. Temporal trends of these genetic metrics did not align with incidence rates, suggesting that changes in genetic metrics may not straightforwardly reflect changes in transmission intensity, particularly in low transmission settings where genetic drift and importation have a substantial impact. While 88% of samples were genetically linked, continuous turnover in genetic clusters and changes in drug-resistance haplotypes were observed. Principal component analysis revealed some STP samples were genetically similar to those from Central and West Africa, indicating possible importation. These findings highlight the need to prioritize several interventions such as targeted interventions against transmission hotspots, reactive case detection, and strategies to reduce the introduction of new parasites into this island nation as it approaches elimination. This study also serves as a case study for implementing genetic surveillance in a low transmission setting.

Biological sequence analysis

et al. 2022
