With the development of multimedia technology and computer networks, the number of available images is growing explosively. This growth also creates difficulties for users: it is often very hard to find the important details in such a huge amount of available data. Image scene and emotion categorization technologies are therefore urgently needed. The purpose of emotional image classification is to enable the computer to express the emotional reaction evoked by observing an image, and to classify images into different emotional categories automatically. Scene image classification concerns how to make computer systems automatically classify image sets that contain semantic information, in accordance with the human visual perception mechanism. For the scene categorization problem based on visual words, the dissertation presents a novel learning framework for designing discriminative semantic visual words. Finally, for the emotion categorization of natural scene images, the dissertation presents an emotion categorization method using an Affective-probabilistic Latent Semantic Analysis model based on visual cognitive theory.