The Transporter Classification Database (TCDB; http://www.tcdb.org) serves as a common reference point for transport protein research. The database contains more than 10 000 non-redundant proteins that represent all currently recognized families of transmembrane molecular transport systems. Proteins in TCDB are organized in a five-level hierarchical system, where the first two levels are the class and subclass, the second two are the family and subfamily, and the last one is the transport system. Superfamilies that contain multiple families are included as hyperlinks to the five-tier TC hierarchy. TCDB includes proteins from all types of living organisms and is the only transporter classification system that is both universal and recognized by the International Union of Biochemistry and Molecular Biology. It has been expanded by manual curation, contains extensive text descriptions providing structural, functional, mechanistic and evolutionary information, is supported by unique software and is interconnected to many other relevant databases. TCDB is of increasing usefulness to the international scientific community and can serve as a model for the expansion of database technologies. This manuscript describes an update of the database descriptions previously featured in NAR database issues.
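The five-level hierarchy described above can be illustrated with a short Python sketch that splits a dotted TC identifier into its levels. The helper names (`parse_tcid`, `family_of`) and the sample identifiers are illustrative assumptions; this is not part of TCDB's own software, and it assumes the dot-separated form seen in family numbers such as 2.A.15.

```python
# Minimal sketch: split a dotted TC identifier into the five levels
# named in the abstract. Function names are hypothetical, not a TCDB API.
LEVELS = ("class", "subclass", "family", "subfamily", "system")


def parse_tcid(tcid: str) -> dict:
    """Map each dot-separated component to its hierarchy level.

    Partial identifiers (e.g. a family number with three components)
    yield only the levels that are present.
    """
    parts = tcid.strip().split(".")
    if not 1 <= len(parts) <= len(LEVELS) or any(p == "" for p in parts):
        raise ValueError(f"not a valid TC identifier: {tcid!r}")
    return dict(zip(LEVELS, parts))


def family_of(tcid: str) -> str:
    """Return the family-level prefix (class.subclass.family)."""
    parts = tcid.strip().split(".")
    if len(parts) < 3:
        raise ValueError(f"identifier has no family level: {tcid!r}")
    return ".".join(parts[:3])


if __name__ == "__main__":
    print(parse_tcid("2.A.1.1.1"))
    print(family_of("2.A.15.1.3"))
```

Keeping the components as strings avoids any assumption about which levels are numeric and which are letters.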
The major facilitator superfamily (MFS) is the largest known superfamily of secondary carriers found in the biosphere. It is ubiquitously distributed throughout virtually all currently recognized organismal phyla. This superfamily currently (2012) consists of 74 families, each of which is usually concerned with the transport of a certain type of substrate. Many of these families, defined phylogenetically, do not include even a single member that is functionally characterized. In this article, we probe the evolutionary origins of these transporters, providing evidence that they arose from a single 2‐transmembrane segment (TMS) hairpin structure that triplicated to give a 6‐TMS unit that duplicated to a 12‐TMS protein, the most frequent topological type of these permeases. We globally examine MFS protein topologies, focusing on exceptional proteins that deviate from the norm. Nine distantly related families appear to have members with 14 TMSs in which the extra two are usually centrally localized between the two 6‐TMS repeat units. They probably have arisen by intragenic duplication of an adjacent hairpin. This alternative topology probably arose multiple times during MFS evolution. Convincing evidence for MFS permeases with fewer than 12 TMSs was not forthcoming, leading to the suggestion that all 12 TMSs are required for optimal function. Some homologs appear to have 13, 14, 15 or 16 TMSs, and the probable locations of the extra TMSs were identified. A few MFS permeases are fused to other functional domains or are fully duplicated to give 24‐TMS proteins with dual functions. Finally, the MFS families with no known function were subjected to genomic context analyses leading to functional predictions.
The Transporter Classification Database (TCDB; tcdb.org) is a freely accessible reference resource, which provides functional, structural, mechanistic, medical and biotechnological information about transporters from organisms of all types. TCDB is the only transport protein classification database adopted by the International Union of Biochemistry and Molecular Biology (IUBMB) and now (October 1, 2020) consists of 20 653 proteins classified in 15 528 non-redundant transport systems with 1567 tabulated 3D structures, 18 336 reference citations describing 1536 transporter families, of which 26% are members of 82 recognized superfamilies. Overall, this is an increase of over 50% since the last published update of the database in 2016. This comprehensive update of the database contents and features includes (i) adoption of a chemical ontology for substrates of transporters, (ii) inclusion of new superfamilies, (iii) a domain-based characterization of transporter families for the identification of new members as well as functional and evolutionary relationships between families, (iv) development of novel software to facilitate curation and use of the database, (v) addition of new subclasses of transport systems including 11 novel types of channels and 3 types of group translocators and (vi) the inclusion of many man-made (artificial) transmembrane pores/channels and carriers.
The Bio-V Suite is a collection of Python scripts designed specifically for bioinformatic research regarding transport protein evolution. The Bio-V Suite contains nine powerful programs for Unix-based environments, each of which can be run as a standalone tool or be accessed in a programmatic fashion. These programs and their functions are as follows: TMStats generates topological statistics for transport proteins. GSAT performs shuffle-based binary alignments and is fully scalable. It can cross compare two FASTA files or individual sequences. Protocol1 performs remote PSI-BLAST searches and filters redundant/similar sequences and annotates them. Protocol2 finds homologues between FASTA lists and generates graphical reports. TSSearch uses a rapid search algorithm to find distant homologues in FASTA files in a heuristic manner. SSearch is the exhaustive version of TSSearch. GBlast will identify potential transport proteins in any genome/proteome file, or find similar transport protein homologues between two different genomes/proteomes before generating a graphical report. AncientRep will find putative transmembrane repeat units using a list of homologues. DefineFamily will generate a FASTA list to represent an entire TC family. These nine programs are tabulated with descriptions of their capabilities in Table 1.
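To make the idea of a shuffle-based pairwise comparison concrete, here is a minimal Python sketch. It is not GSAT's implementation: the scoring scheme (a simple Needleman-Wunsch global score with fixed match, mismatch and gap values), the z-score statistic, and the names `nw_score` and `shuffle_z_score` are all assumptions chosen for illustration. The real alignment score is compared with scores obtained after shuffling both sequences, which preserves composition while destroying order.

```python
import random
import statistics


def nw_score(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -2) -> int:
    """Global alignment score (Needleman-Wunsch, linear gap penalty).

    The scoring values are illustrative assumptions, not a real
    substitution matrix.
    """
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def shuffle_z_score(a: str, b: str, shuffles: int = 200, seed: int = 0) -> float:
    """Standard deviations separating the real score from shuffled scores."""
    if shuffles < 2:
        raise ValueError("need at least two shuffles to estimate a spread")
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    real = nw_score(a, b)
    background = []
    for _ in range(shuffles):
        sa, sb = list(a), list(b)
        rng.shuffle(sa)
        rng.shuffle(sb)
        background.append(nw_score("".join(sa), "".join(sb)))
    spread = statistics.stdev(background)
    if spread == 0:
        return 0.0  # shuffling changed nothing (e.g. homopolymers)
    return (real - statistics.mean(background)) / spread


if __name__ == "__main__":
    seq = "MKTLLVAGGHWRQPDENSYFCI"  # made-up sequence for demonstration
    print(shuffle_z_score(seq, seq, shuffles=50))
```

A large positive value suggests that the observed similarity depends on residue order rather than composition alone; any threshold for inferring homology would have to come from the tool's own documentation, not from this sketch.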
The amino acid-polyamine-organocation (APC) superfamily has been shown to include five recognized families, four of which are specific for amino acids and their derivatives. Recent high-resolution X-ray crystallographic data have shown that four additional transporter families (BCCT, TC No. 2.A.15; SSS, 2.A.21; NSS, 2.A.22; and NCS1, 2.A.39), transporting a wide range of solutes, exhibit sufficiently similar folds to suggest a common evolutionary origin. We have used established statistical methods, based on sequence similarity, to show that these families are, in fact, members of the APC superfamily. We also identify two additional families (NCS2, 2.A.40; SulP, 2.A.53) as being members of this superfamily. Repeat sequences, each having five transmembrane α-helical segments and arising via ancient intragenic duplications, are demonstrated for all of these families, further strengthening the conclusion of homology. The APC superfamily appears to be the second largest superfamily of secondary carriers, the largest being the major facilitator superfamily (MFS). Although the topology of the members of the APC superfamily differs from that of the MFS, both families appear to have arisen from a common ancestral 2 TMS hairpin structure that underwent intragenic triplication followed by loss of a TMS in the APC family, to give the repeat units that are characteristic of these two superfamilies.
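The proposed evolutionary pathways reduce to simple arithmetic on transmembrane segment (TMS) counts, sketched below in Python. The function names are hypothetical. The 10-TMS total for APC proteins is an inference from two five-TMS repeats rather than a figure stated in the text, and the 14-TMS case follows the MFS description of an extra duplicated hairpin.

```python
# Worked arithmetic for the proposed duplication pathways.
# Names are illustrative; the counts come from the abstracts, except the
# APC total, which is inferred from two five-TMS repeat units.
HAIRPIN_TMS = 2  # ancestral 2-TMS hairpin


def repeat_unit_tms(copies: int = 3, lost: int = 0) -> int:
    """TMS count of a repeat unit built from hairpin copies, minus any lost."""
    return HAIRPIN_TMS * copies - lost


def protein_tms(unit: int, unit_copies: int = 2, extra: int = 0) -> int:
    """TMS count of a protein built from repeat units plus extra segments."""
    return unit * unit_copies + extra


mfs_unit = repeat_unit_tms()                           # triplication: 6 TMSs
mfs_protein = protein_tms(mfs_unit)                    # duplication: 12 TMSs
mfs_variant = protein_tms(mfs_unit, extra=HAIRPIN_TMS)  # extra hairpin: 14 TMSs

apc_unit = repeat_unit_tms(lost=1)                     # triplication, one lost: 5 TMSs
apc_protein = protein_tms(apc_unit)                    # two repeats: 10 TMSs (inferred)

if __name__ == "__main__":
    print(mfs_unit, mfs_protein, mfs_variant, apc_unit, apc_protein)
```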