Fecal contamination is one of the factors driving the deterioration of Laguna Lake. Although total coliform levels are routinely monitored, no protocol is in place to identify the origin of the coliforms. This gap can be addressed with a library-dependent microbial source tracking (MST) method, repetitive element sequence-based polymerase chain reaction (rep-PCR) fingerprinting. As a prerequisite to developing the host-origin library, we assessed the discriminatory power of three fingerprinting primers: BOX-A1R, (GTG)5, and REP1R-1/2-1. Fingerprint profiles were obtained from 290 thermotolerant Escherichia coli isolated from sewage waters and from fecal samples of cows, chickens, and pigs in regions surrounding the lake. Band patterns were converted into binary profiles and classified using discriminant analysis of principal components. The results show that (1) REP1R-1/2-1 has a low genotyping success rate and low information content; (2) increasing the library size led to more precise estimates of library accuracy; (3) combining fingerprint profiles from BOX-A1R and (GTG)5 gave the best discrimination (average rate of correct classification (ARCC) = 0.82 ± 0.06) in a two-way categorical split; and (4) in a four-way split, no significant difference was found between the combined profiles (0.74 ± 0.15) and BOX-A1R alone (0.76 ± 0.09). When the library was tested on known isolates from a separate dataset, the two-way classification performed better (ARCC = 0.66) than the four-way split (ARCC = 0.29). The library can be developed further by adding more representative isolates per host source. Nevertheless, our results indicate that combining profiles from BOX-A1R and (GTG)5 is the recommended approach for developing the MST library for Laguna Lake.
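To make the ARCC figures above concrete, the following is a minimal sketch of how such a score could be computed from known and predicted host sources. It assumes ARCC is the unweighted mean of the per-source rates of correct classification, which is a common convention in the MST literature but is not spelled out in the abstract. The function names and the toy labels are hypothetical and are not taken from the study.

```python
from collections import defaultdict


def rates_of_correct_classification(true_labels, predicted_labels):
    """Return {source: fraction of that source's isolates assigned to it}."""
    totals = defaultdict(int)
    correct = defaultdict(int)
    for truth, guess in zip(true_labels, predicted_labels):
        totals[truth] += 1
        if truth == guess:
            correct[truth] += 1
    return {source: correct[source] / totals[source] for source in totals}


def average_rate_of_correct_classification(true_labels, predicted_labels):
    """Assumed definition: unweighted mean of the per-source rates."""
    rcc = rates_of_correct_classification(true_labels, predicted_labels)
    return sum(rcc.values()) / len(rcc)


# Hypothetical two-way split (not data from the study).
truth = ["human", "human", "human", "human", "animal", "animal"]
guess = ["human", "human", "human", "animal", "animal", "animal"]

rcc = rates_of_correct_classification(truth, guess)
arcc = average_rate_of_correct_classification(truth, guess)
print(rcc, arcc)
```

Under this assumed definition each host source contributes equally regardless of how many isolates it has, so the score can differ from overall accuracy when the library is unbalanced.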
Laguna Lake is an economically important resource in the Philippines, with reports of declining water quality due to fecal pollution. Current monitoring methods rely on counts of fecal indicator bacteria, which give no information on potential sources of contamination. In this study, we predicted the sources of Escherichia coli in lake stations and tributaries by establishing a fecal source library composed of rep-PCR DNA fingerprints from human, cattle, swine, poultry, and sewage samples (n = 1,408). We also evaluated three statistical methods for predicting fecal contamination sources in surface waters. Random forest (RF) outperformed k-nearest neighbors and discriminant analysis of principal components in average rate of correct classification in two-way (84.85%), three-way (82.45%), and five-way (74.77%) categorical splits. Overall, RF gave the most balanced predictions, which is crucial for disproportionate libraries. Source tracking of environmental isolates (n = 332) with RF revealed that sewage was dominant (47.59%), followed by human sources (29.22%), poultry (12.65%), swine (7.23%), and cattle (3.31%). This study demonstrates the promising utility of a library-dependent method in augmenting current monitoring systems for source attribution of fecal contamination in Laguna Lake. It is also the first known report of microbial source tracking using rep-PCR in surface waters of the Laguna Lake watershed.