Fecal contamination of water resources is the main cause of enteric waterborne diseases worldwide. The traditional indicator methods used to assess the microbiological quality of water cannot identify the source of fecal contamination. This work aimed to search for molecular markers in hosts and track them in water samples in order to identify pollution sources in surface waters in São Paulo State, Brazil. Two library-dependent methods were applied to E. coli strains isolated from different hosts and from water samples: a genotypic typing method (E. coli phylogenetic groups) and a phenotypic typing method (MALDI-TOF/MS). A library-independent method, 454 pyrosequencing of the hypervariable V3 region of the 16S rRNA gene, was applied to DNA from feces and water samples. Phylogenetic groups were used as a tool for host classification, and correspondence analysis showed clusters according to feeding habits. The classification of environmental samples revealed higher frequencies of subgroups A₁ and B2₃ in rivers impacted by human pollution sources, while subgroups D₁ and D₂ were associated with pristine sites and subgroup B1 with domestic animal sources, indicating that these subgroups can serve as a first screening for pollution source identification. A simple classification based on phylogenetic subgroup distribution using the w-clique metric is proposed, enabling differentiation of polluted and unpolluted sites. Protein profiles of E. coli strains isolated from host and water samples were analyzed by MALDI-TOF/MS. Host-specific biomarkers were identified and are indicated as a potential tool for source tracking. Validation with E. coli strains isolated from rivers and reservoirs showed that water samples presented markers from different hosts, suggesting that these water bodies have mixed sources of fecal contamination. Sequencing of the 16S rRNA V3 region in stool samples (human and bovine) and water samples yielded 4296 operational taxonomic units (OTUs).
The greatest diversity was observed in the cattle feces samples and the lowest in the pristine water sample. Firmicutes was the predominant group in the human feces samples, while Firmicutes and Bacteroidetes were the most common groups in bovine feces. The interaction network showed that the stool samples had the greatest diversity and that, among the water samples, the one impacted by a human pollution source was the most diverse. The LEfSe method was used to identify host biomarkers. Actinobacteria, Betaproteobacteria and Firmicutes were identified as human biomarkers, and Bacteroidetes, Tenericutes and Spirochaetes as potential cattle markers. Host-specific markers were identified but were not found in the water samples, suggesting either that the tools used lack the resolution to identify markers in environmental samples or that contamination in the water bodies is mixed. Additionally, as the host-specific markers were obtained from non-autochthonous microorganisms, they could be affected by adverse environmental conditions such as physicochemical factors and competition with native organisms.