The Clostridium thermocellum F1 celJ gene, encoding endoglucanase J (CelJ), consists of an open reading frame (ORF) of 4,803 nucleotides and encodes a protein of 1,601 amino acids with a molecular weight of 178,055. The ORF was confirmed as celJ by comparison with the N-terminal sequence of a truncated CelJ derivative. CelJ is a modular enzyme composed of an N-terminal signal peptide and six domains in the following order: an S-layer homology domain, a domain of unknown function (UD-1), a subfamily E1 endoglucanase domain, a family J endoglucanase domain, a docking domain, and another domain of unknown function (UD-2). UD-1 has no significant similarity to UD-2. CelJ hydrolyzed carboxymethylcellulose and xylan, and the xylanase activity was ascribed to the family J domain. Antiserum raised against the truncated CelJ cross-reacted with proteins contained in the cellulosome of C. thermocellum F1. These results strongly suggest that CelJ is equivalent to S2, which was identified as the largest catalytic component in the cellulosome of C. thermocellum YS. A second, incomplete ORF, encoding an enzyme classified as a subfamily E2 endoglucanase, was located downstream of celJ.
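
The sizes quoted above can be cross-checked with simple arithmetic. The following minimal sketch tests whether the ORF length matches the protein length at three nucleotides per codon, and derives the mean mass per residue. It rests on two assumptions not stated in the text: that the 4,803-nucleotide figure excludes the stop codon, and that the molecular weight is given in daltons. The function and constant names are illustrative only.

```python
# Figures quoted in the abstract.
ORF_NUCLEOTIDES = 4803
PROTEIN_RESIDUES = 1601
MOLECULAR_WEIGHT = 178055  # assumed to be in daltons

NUCLEOTIDES_PER_CODON = 3


def codons_in_orf(nucleotides: int) -> int:
    """Return the number of whole codons in an ORF of the given length.

    Raises ValueError if the length is not a multiple of three,
    since such a length cannot be a complete reading frame.
    """
    codons, remainder = divmod(nucleotides, NUCLEOTIDES_PER_CODON)
    if remainder:
        raise ValueError(f"{nucleotides} nt is not a whole number of codons")
    return codons


# Assumption: the stop codon is not counted in the quoted ORF length.
codon_count = codons_in_orf(ORF_NUCLEOTIDES)
mean_residue_mass = MOLECULAR_WEIGHT / PROTEIN_RESIDUES

print(codon_count == PROTEIN_RESIDUES)  # True
print(round(mean_residue_mass, 1))      # 111.2
```

Under these assumptions the figures agree with one another: 4,803 nucleotides correspond to exactly 1,601 codons, and the quoted weight works out to about 111 per residue.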