In this paper, a weakly supervised domain generalization (WSDG) method is proposed for real-world visual recognition tasks, in which we train classifiers by using Web data (e.g., Web images and Web videos) with noisy labels. In particular, two challenging problems need to be solved when learning robust classifiers, in which the first issue is to cope with the label noise of training Web data from the source domain, while the second issue is to enhance the generalization capability of learned classifiers to an arbitrary target domain. In order to handle the first problem, the training samples within each category are partitioned into clusters, where we use one bag to denote each cluster and instances to denote the samples in each cluster. Then, we identify a proportion of good training samples in each bag and train robust classifiers by using the good training samples, which leads to a multi-instance learning (MIL) problem. In order to handle the second problem, we assume that the training samples possibly form a set of hidden domains, with each hidden domain associated with a distinctive data distribution. Then, for each category and each hidden latent domain, we propose to learn one classifier by extending our MIL formulation, which leads to our WSDG approach. In the testing stage, our approach can obtain better generalization capability by effectively integrating multiple classifiers from different latent domains in each category. Moreover, our WSDG approach is further extended to utilize additional textual descriptions associated with Web data as privileged information (PI), although testing data do not have such PI. Extensive experiments on three benchmark data sets indicate that our newly proposed methods are effective for real-world visual recognition tasks by learning from Web data.