Imagine you are traveling to Columbia, MO for the first time. On your flight to Columbia, the woman sitting next to you recommended a bakery by a large park with a big yellow umbrella outside. After you land, you need directions to the hotel from the airport. Suppose you are driving a rental car, you will need to park your car at a parking lot or a parking structure. After a good night's sleep in the hotel, you may decide to go for a run in the morning on the closest trail and stop by that recommended bakery under a big yellow umbrella. It would be helpful in the course of completing all these tasks to accurately distinguish the proper car route and walking trail, find a parking lot, and pinpoint the yellow umbrella. Satellite imagery and other geo-tagged data such as Open Street Maps provide effective information for this goal. Open Street Maps can provide road information and suggest bakery within a five-mile radius. The yellow umbrella is a distinctive color and, perhaps, is made of a distinctive material that can be identified from a hyperspectral camera. Open Street Maps polygons are tagged with information such as "parking lot" and "sidewalk." All these information can and should be fused to help identify and offer better guidance on the tasks you are completing. Supervised learning methods generally require precise labels for each training data point. It is hard (and probably at an extra cost) to manually go through and label each pixel in the training imagery. GPS coordinates cannot always be fully trusted as a GPS device may only be accurate to the level of several pixels. In many cases, it is practically infeasible to obtain accurate pixel-level training labels to perform fusion for all the imagery and maps available. Besides, the training data may come in a variety of data types, such as imagery or as a 3D point cloud. The imagery may have different resolutions, scales and, even, coordinate systems. Previous fusion methods are generally only limited to data mapped to the same pixel grid, with accurate labels. Furthermore, most fusion methods are restricted to only two sources, even if certain methods, such as pan-sharpening, can deal with different geo-spatial types or data of different resolution. It is, therefore, necessary and important, to come up with a way to perform fusion on multiple sources of imagery and map data, possibly with different resolutions and of different geo-spatial types with consideration of uncertain labels. I propose a Multiple Instance Choquet Integral framework for multi-resolution multisensor fusion with uncertain training labels. The Multiple Instance Choquet Integral (MICI) framework addresses uncertain training labels and performs both classification and regression. Three classifier fusion models, i.e. the noisy-or, min-max, and generalized-mean models, are derived under MICI. The Multi-Resolution Multiple Instance Choquet Integral (MR-MICI) framework is built upon the MICI framework and further addresses multiresolution in the fusion sources in addition to the uncertainty in training labels. For both MICI and MR-MICI, a monotonic normalized fuzzy measure is learned to be used with the Choquet integral to perform two-class classifier fusion given bag-level training labels. An optimization scheme based on the evolutionary algorithm is used to optimize the models proposed. For regression problems where the desired prediction is real-valued, the primary instance assumption is adopted. The algorithms are applied to target detection, regression and scene understanding applications. Experiments are conducted on the fusion of remote sensing data (hyperspectral and LiDAR) over the campus of University of Southern Mississippi - Gulfpark. Clothpanel sub-pixel and super-pixel targets were placed on campus with varying levels of occlusion and the proposed algorithms can successfully detect the targets in the scene. A semi-supervised approach is developed to automatically generate training labels based on data from Google Maps, Google Earth and Open Street Map. Based on such training labels with uncertainty, the proposed algorithms can also identify materials on campus for scene understanding, such as road, buildings, sidewalks, etc. In addition, the algorithms are used for weed detection and real-valued crop yield prediction experiments based on remote sensing data that can provide information for agricultural applications.