Sparse representation based dictionary learning has been exploited in solving various image analysis problems -image classification, tracking, quality assessment, de-noising, image reconstruction. The objective of dictionary learning is to obtain an adaptive basis function from the data and simultaneously provide a compact representation. In this work we employ sparse representation based dictionary learning techniques for segmentation, image classification and video analysis problems.In image and video processing applications, one of the major challenges is the choice of appropriate features for image representation. Various techniques exist that employ different analytical methods to extract color, texture and frequency information from images. However, these methods do not identify which of this information are more relevant for a particular image. Neither do these methods have any discrimination power to recognize more informative local image regions.In this work, we first tackle the problem of query specific image feature descriptor selection.Depending on the image content, different features e.g., color texture, structure can prove to be more relevant in representing and discriminating an image. We use a discriminative dictionary learning method in designing a classifier and an information theoretic measure to select the most appropriate feature for an image. This method attempt to identify the feature descriptors that provide more information about an image conditioned on the available images in a class.In image classification, while identifying the relevant feature type is important, it is also iii Abstract crucial to identify the essential contents of an image which discriminate it from the others.While the above mentioned solution is appropriate for determining image specific feature type, it does not incorporate any local image analysis to identify image regions associated with an object. To address this problem, we develop a method that leverages salient object detection framework to learn the dictionary and sparse codes from an image. The method simultaneously detects relevant image regions and computes a compact image representation.We also devise similarity measures exploiting the sparse representations for comparing image pairs. This similarity measure is used in image classification particularly for scenarios where training data is limited. Our method outperformed the state of the art methods by an average of 12% in overall accuracy for histo-pathological tissue image classification Although the above mentioned saliency guided dictionary learning method is applied to image classification, the application is not limited to just object recognition. The method is hence exploited for event detection from video. The saliency based dictionary learning and the similarity measure is used first for a frame by frame analysis to identify the temporal occurrence of an event. To make the system more robust to occlusion, dynamic background, we further employ a spatio-temporal saliency driven low rank ...