Semi-continuous data, a special type of data that combine continuous values with repeated occurrences of a few discrete values, appear frequently in many scientific fields. For example, in epidemiological studies, data often include zeros for areas without a certain disease and positive values indicating the severity of the diagnosed cases in other areas. Such data often contain outliers and errors from experiment and measurement, which cannot be prevented.
Modeling semi-continuous data in the presence of noise is challenging because the data contain outliers and are skewed relative to the normal distribution, both of which may mislead the modeling results. Modeling problems with such data is imperative, but available techniques are very limited.
This dissertation aims to develop a formal methodology using supervised learning when: (1) relevant information is stored in a series of images; (2) the images are usually noisy and distorted relative to each other; (3) the data set for modeling has a semi-continuous response variable; and (4) the modeling goal is to understand the causal mechanism between variables and to predict future events accurately. The developed methodology includes three components for this objective: an image fusion algorithm, an outlier detection framework, and a two-part generalized hierarchical model for semi-continuous data. These components have been applied to a real corrosion problem, and the modeling results showed that the methodology solved this problem effectively. Corrosion data were efficiently extracted from corrosion images, outliers within the extracted data were detected and treated properly, and, most importantly, the underlying causal mechanisms between material microstructures and corrosion evolution were revealed by the generalized hierarchical model.
Four major contributions have been made: a supervised learning methodology is constructed for problems with information stored in both semi-continuous data and a series of noisy images; an outlier detection framework for supervised learning is constructed to enhance prediction accuracy; an image fusion algorithm is designed to extract and combine information from multiple noisy images; and the estimation of the generalized hierarchical model helps materials scientists reveal the causal mechanisms between grain boundary characteristics and intergranular corrosion, as well as predict future corrosion occurrences. Future work is discussed at the end of the dissertation.
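To make the idea of a two-part model for a semi-continuous response concrete, the following minimal sketch may help. It is not the generalized hierarchical model of this dissertation: it assumes a single-level model, a logistic regression for whether the response is positive, and a lognormal regression for the positive values. All function names, the simulated data, and the parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np

def fit_logistic(X, z, n_iter=25):
    """Fit a logistic regression by Newton-Raphson iterations."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        w = p * (1.0 - p)
        grad = X.T @ (z - p)
        # Small ridge term keeps the Hessian invertible (illustrative choice).
        hess = X.T @ (X * w[:, None]) + 1e-8 * np.eye(X.shape[1])
        beta = beta + np.linalg.solve(hess, grad)
    return beta

def fit_two_part(X, y):
    """Part 1: P(y > 0 | x). Part 2: log(y) | y > 0, by least squares."""
    pos = y > 0
    beta_zero = fit_logistic(X, pos.astype(float))
    Xp, log_y = X[pos], np.log(y[pos])
    beta_pos, *_ = np.linalg.lstsq(Xp, log_y, rcond=None)
    resid = log_y - Xp @ beta_pos
    sigma2 = resid @ resid / (len(log_y) - X.shape[1])
    return beta_zero, beta_pos, sigma2

def predict_mean(X, beta_zero, beta_pos, sigma2):
    """E[y | x] = P(y > 0 | x) * E[y | y > 0, x] under the lognormal assumption."""
    p = 1.0 / (1.0 + np.exp(-X @ beta_zero))
    return p * np.exp(X @ beta_pos + sigma2 / 2.0)

# Simulated semi-continuous data (hypothetical parameters, not from the dissertation).
rng = np.random.default_rng(0)
n = 5000
X = np.column_stack([np.ones(n), rng.normal(size=n)])
true_zero = np.array([0.5, 1.0])   # coefficients of the zero/positive part
true_pos = np.array([1.0, 0.3])    # coefficients of the positive part
is_pos = rng.random(n) < 1.0 / (1.0 + np.exp(-X @ true_zero))
y = np.where(is_pos, np.exp(X @ true_pos + 0.5 * rng.normal(size=n)), 0.0)

beta_zero, beta_pos, sigma2 = fit_two_part(X, y)
y_hat = predict_mean(X, beta_zero, beta_pos, sigma2)
```

Fitting the two parts separately keeps the exact zeros from distorting the fit of the positive values; a hierarchical version would add group-level effects to each part.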