Quantifying the information content of remote-sensing images is fundamental to the information-theoretic characterization of remote sensing information processes, in which images usually act as the information sources. Information-theoretic methods complement conventional statistical methods: they allow images and their derivatives to be described and analyzed in terms of information, as defined in information theory, rather than data per se. However, accurately quantifying the information content of images is nontrivial, because the information redundancy caused by spectral and spatial dependence must be properly handled. Little systematic research has addressed this problem, which has hampered wider application of information theory. This paper seeks to fill this gap by proposing a strategy for quantifying information content in multispectral images based on information theory, geostatistics, and image transformations, by which interband spectral dependence, intraband spatial dependence, and the additive noise inherent to multispectral images are handled effectively. Specifically, to handle spectral dependence, independent component analysis (ICA) is performed to transform a multispectral image into one with statistically independent image bands (which are not the spectral bands of the original image). The ICA-transformed image is then normal-transformed so that information content can be computed from the entropy formulas for Gaussian distributions. The normal transform also allows spatial dependence to be incorporated directly into the entropy computation for the double-transformed image bands, with inter-pixel spatial correlation modeled via variograms. Experiments were undertaken using Landsat ETM+ and TM image subsets featuring different dominant land cover types (i.e., built-up, agricultural, and hilly).
The experimental results confirm that the proposed methods provide more objective estimates of information content than approaches in which spectral dependence, spatial dependence, or non-normality is not properly accommodated. The difference in information content between the image subsets acquired by ETM+ and TM was found to be about 3.6 bits/pixel, indicating the greater information content of the former. The proposed methods can be adapted for information-theoretic analyses of remote sensing information processes.
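
To illustrate why spatial dependence matters in the entropy computation, the following minimal sketch compares the per-pixel differential entropy of a Gaussian image band with and without inter-pixel correlation. It is not the paper's implementation. The exponential covariance model, the sill and range values, the 8 x 8 grid, and the function names `exponential_covariance` and `gaussian_entropy_bits_per_pixel` are all assumptions made for illustration, and the sketch omits the ICA step, the normal transform, and the treatment of noise.

```python
import numpy as np


def exponential_covariance(coords, sill, practical_range):
    """Covariance matrix C(h) = sill * exp(-3h / practical_range).

    This is the covariance counterpart of an exponential variogram
    gamma(h) = sill - C(h); the model choice is an assumption here.
    """
    diffs = coords[:, None, :] - coords[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    return sill * np.exp(-3.0 * dists / practical_range)


def gaussian_entropy_bits_per_pixel(cov):
    """Differential entropy of N(0, cov), in bits, divided by the pixel count.

    Uses H = 0.5 * ln((2*pi*e)^n * det(cov)) and converts nats to bits.
    """
    n = cov.shape[0]
    sign, logdet = np.linalg.slogdet(cov)
    if sign <= 0:
        raise ValueError("covariance matrix must be positive definite")
    h_nats = 0.5 * (n * np.log(2.0 * np.pi * np.e) + logdet)
    return h_nats / (n * np.log(2.0))


# Pixel centres of a small 8 x 8 window (hypothetical size).
rows, cols = np.meshgrid(np.arange(8), np.arange(8), indexing="ij")
coords = np.column_stack([rows.ravel(), cols.ravel()]).astype(float)

# Unit-variance band with independent pixels versus correlated pixels.
h_independent = gaussian_entropy_bits_per_pixel(np.eye(coords.shape[0]))
h_correlated = gaussian_entropy_bits_per_pixel(
    exponential_covariance(coords, sill=1.0, practical_range=5.0)
)

print(f"independent pixels: {h_independent:.3f} bits/pixel")
print(f"correlated pixels:  {h_correlated:.3f} bits/pixel")
```

Under these assumptions the correlated case always yields the lower per-pixel entropy, because the determinant of a covariance matrix cannot exceed the product of its diagonal entries. Ignoring spatial dependence would therefore overstate the information content. Note that these are differential entropies of continuous variables, so only their differences, not their absolute values, are comparable to bits/pixel figures for quantized images.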