Background

Many Gram-negative bacteria use type III secretion systems (T3SSs) to translocate effector proteins into host cells. T3SS effectors can give some bacteria a competitive edge over others in the same environment, and can help bacteria invade host cells and multiply rapidly within the host. Developing efficient methods to identify effectors scattered across bacterial genomes can therefore lead to a better understanding of host-pathogen interactions and ultimately to important medical and biotechnological applications.

Results

We used 21 genomic and proteomic attributes to create a precise and reliable T3SS effector prediction method called Genome Search for Effectors Tool (GenSET). Five machine learning algorithms were trained on effectors selected from different organisms, and a trained (voting) algorithm was then applied to identify other effectors present in genome testing sets from the same organism (GenSET Phase 1) or a different organism (GenSET Phase 2). Although a select (filtered) group of attributes, which included the codon adaptation index, probability of expression in inclusion bodies, N-terminal disorder, and G + C content, was better at discriminating between positive and negative sets, algorithm performance was better when all 21 attributes (unfiltered) were used. Performance scores (sensitivity, specificity and area under the curve) from GenSET Phase 1 were better than those reported for six published methods. More importantly, GenSET Phase 1 ranked more known effectors (70.3%) in the top 40 ranked proteins and predicted 10–80% more effectors than three available programs in three of the four organisms tested. GenSET Phase 2 predicted 43.8% of effectors in the top 40 ranked proteins when tested on four related or unrelated organisms. The lower prediction rates from GenSET Phase 2 may be due to the presence of different translocation signals in effectors from different T3SS families.

Conclusions

The species-specific GenSET Phase 1 method offers an alternative approach to T3SS effector prediction that can be used alongside other published programs to improve effector predictions. Additionally, our approach can be applied to predict effectors of other secretion systems, as long as these effectors have translocation signals embedded in their sequences.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-016-3363-1) contains supplementary material, which is available to authorized users.
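
To make the voting-and-ranking step in the Results concrete, here is a minimal Python sketch. It is not the GenSET implementation: the abstract does not say how the five classifiers' votes are combined, so averaging per-classifier probabilities (soft voting) is an assumption, as is reading the "top 40" figure as the fraction of known effectors that land among the 40 highest-ranked proteins. The function names, protein IDs and scores are all hypothetical.

```python
from statistics import mean


def vote_scores(per_classifier_probs):
    """Average each protein's effector probability across classifiers.

    Assumption: soft voting. `per_classifier_probs` is a list of dicts,
    one per classifier, mapping protein ID -> predicted probability.
    """
    proteins = per_classifier_probs[0].keys()
    return {p: mean(probs[p] for probs in per_classifier_probs) for p in proteins}


def top_k(scores, k=40):
    """Return the k highest-scoring protein IDs, best first."""
    return sorted(scores, key=scores.get, reverse=True)[:k]


def fraction_known_in_top_k(scores, known_effectors, k=40):
    """Fraction of known effectors found among the top-k ranked proteins."""
    ranked = set(top_k(scores, k))
    return len(ranked & set(known_effectors)) / len(known_effectors)


# Toy data: two hypothetical classifiers scoring four hypothetical proteins.
clf_a = {"p1": 0.9, "p2": 0.2, "p3": 0.6, "p4": 0.1}
clf_b = {"p1": 0.8, "p2": 0.4, "p3": 0.7, "p4": 0.3}

scores = vote_scores([clf_a, clf_b])
print(top_k(scores, k=2))                                  # ['p1', 'p3']
print(fraction_known_in_top_k(scores, {"p1", "p2"}, k=2))  # 0.5
```

With real data, each dict would hold one trained classifier's scores for every protein in a test genome, and `k=40` would match the cutoff quoted above.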