To identify new members of protein families by in silico approaches, we have developed a semi-automated procedure of consecutive PSI-BLAST (Position-Specific Iterated Basic Local Alignment Search Tool) searches that incorporates both identification and subsequent validation of putative candidates. For a proof-of-concept study, we chose the search for novel members of the claudin family. The initial step was an iterated PSI-BLAST search, starting with the PMP22_Claudin domain of each known member of the claudin family, against the human subset of the RefSeq database. Putative new claudin domains derived from the converged hit list were then evaluated by a validating PSI-BLAST search, in which each candidate sequence was used as the query and assessed for its ability to retrieve the starting set of known claudin domains. The local PSI-BLAST searches and the validation were automated by a set of Perl scripts. With this strategy, a total of three additional putative claudin domains in three different proteins were identified. One of them was subjected to further characterization and was shown to exhibit claudin-like features in terms of protein structure and expression pattern. The strategy we present is an efficient and versatile tool for identifying novel members of domain-sharing protein families. The low rate of false positives achieved by including a validation step in the in silico procedure makes this strategy particularly attractive for selecting candidates for subsequent labor-intensive wet-bench characterization.
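The two-stage logic described above (a forward search from known domains, followed by a reverse search from each candidate) can be illustrated with a minimal Python sketch. It is not the authors' Perl implementation. The `search` callable, the function names, and the `min_recovered` threshold are all assumptions introduced for illustration; the abstract does not state what fraction of the starting set a candidate must retrieve to count as validated.

```python
from typing import Callable, Dict, Iterable, Set

# Hypothetical interface: maps a query sequence ID to the set of subject IDs
# reported by a converged PSI-BLAST run against the target database.
SearchFn = Callable[[str], Set[str]]


def find_candidates(known: Iterable[str], search: SearchFn) -> Set[str]:
    """Forward step: pool the hits of every known domain, drop the known ones."""
    known_set = set(known)
    hits: Set[str] = set()
    for query in known_set:
        hits |= search(query)
    return hits - known_set


def validate(
    candidates: Iterable[str],
    known: Iterable[str],
    search: SearchFn,
    min_recovered: float = 1.0,  # assumed threshold, not taken from the source
) -> Dict[str, float]:
    """Validation step: keep candidates whose own search retrieves the known set.

    Returns a mapping from candidate ID to the fraction of known domains
    that the candidate recovered.
    """
    known_set = set(known)
    if not known_set:
        raise ValueError("the starting set of known domains must not be empty")
    validated: Dict[str, float] = {}
    for cand in sorted(set(candidates)):
        recovered = search(cand) & known_set
        fraction = len(recovered) / len(known_set)
        if fraction >= min_recovered:
            validated[cand] = fraction
    return validated


# Toy data standing in for real search results (all IDs are made up).
toy_hits = {
    "KNOWN_1": {"KNOWN_1", "KNOWN_2", "CAND_A", "NOISE_X"},
    "KNOWN_2": {"KNOWN_1", "KNOWN_2", "CAND_A"},
    "CAND_A": {"KNOWN_1", "KNOWN_2", "CAND_A"},
    "NOISE_X": {"NOISE_X", "KNOWN_1"},
}


def toy_search(query: str) -> Set[str]:
    return toy_hits.get(query, set())


known_domains = ["KNOWN_1", "KNOWN_2"]
candidates = find_candidates(known_domains, toy_search)
validated = validate(candidates, known_domains, toy_search)
```

In this toy run, `NOISE_X` appears in the forward search but retrieves only one of the two known domains in the reverse search, so it is filtered out, which is the false-positive reduction the validation step is meant to provide. In a real pipeline, `search` would wrap a local PSI-BLAST run and parse its hit list; that wrapper is left out here because its details depend on the BLAST installation and output format in use.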