Two proteins of the cellulosomal complex of Clostridium thermocellum, SL and SS, are known to degrade crystalline cellulose together. SL is a glycoprotein of 210,000 Da which enhances the binding to cellulose and the activity of SS, an endoglucanase of 83,000 Da. We have previously reported the cloning of a DNA fragment encoding the N-terminal end of the SL protein using antibodies raised against the native protein. A chromosomal walking approach using an EcoRI and a BamHI-Sau3A gene library allowed us to isolate the C-terminal end of the gene. Sequencing of both fragments revealed the existence of a leader peptide, as has been found in cellulases of the same organism. This leader sequence is followed by a stretch of 14 amino acids that is identical to the N-terminal amino acid sequence of the native secreted protein. The open reading frame (ORF) of this gene encodes a protein of 196,800 Da and is followed by a hairpin loop that could be involved in transcription termination. Within the ORF, we found nine internal repeated elements (IREs) of about 500 nucleotides each. Seven of these sequences displayed 98-100% homology and were located adjacent to each other within the structural gene without intervening regions. The remaining two, located at the N-terminal end of the gene, showed significantly lower homology. Bearing in mind the inherent instability of reiterated regions, we confirmed the authenticity of our clones by Southern blot analysis using chromosomal C. thermocellum DNA and ruled out the possibility of rearrangements during the cloning and sequencing process. The sequenced gene is designated cipA and the encoded SL protein CipA.
Two independent collections of clones containing Clostridium thermocellum genes involved in cellulose degradation have been previously obtained at IAPGR, Cambridge, and at the Pasteur Institute, Paris. The two collections were compared for cross‐hybridization, restriction maps and enzyme phenotypes. The truly distinct genes comprised one β‐glucosidase gene, two xylanase genes, and fifteen endoglucanase genes. Two of the cloned fragments contained extraneous DNA which was absent from their respective counterparts isolated in the other collection. The discrepancies resulted from in vivo rearrangements which had occurred in either of the C. thermocellum NCIB 10682 stocks used to generate the two gene banks.