Abstract. Bottom-up saliency, an early stage of human visual attention, can be considered as a binary classification problem between center and surround classes. The discriminant power of features for this classification is measured as the mutual information between the features and the distribution of the two classes. Because the estimated discrepancy between the two feature classes depends strongly on the scale levels considered, multi-scale structure and discriminant power are integrated by employing discrete wavelet features and a hidden Markov tree (HMT). From the wavelet coefficients and the HMT parameters, quad-tree-like label structures are constructed and used in maximum a posteriori (MAP) estimation of the hidden class variables at the corresponding dyadic sub-squares. The saliency value of each dyadic square at each scale level is then computed from the discriminant power principle and the MAP estimate. Finally, the saliency maps from multiple scales are integrated into the final saliency map by an information maximization rule. Standard quantitative measures (NSS, LCC, AUC) and qualitative assessments are used to evaluate the proposed multiscale discriminant saliency method (MDIS) against the well-known information-based saliency method AIM on its Bruce database with eye-tracking data. Simulation results are presented and analyzed to verify the validity of MDIS and to point out its disadvantages as directions for further research.
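The discriminant power measure summarized above can be illustrated with a minimal sketch that estimates the mutual information between a quantized feature and a binary center/surround label from paired samples. The function name `mutual_information`, the integer quantization, and the toy data are assumptions made for illustration only; this is not the estimator used in MDIS, which operates on wavelet coefficients and HMT parameters.

```python
import math
from collections import Counter


def mutual_information(features, labels):
    """Plug-in estimate of I(X; Y) in bits from paired discrete samples."""
    n = len(features)
    if n == 0 or n != len(labels):
        raise ValueError("features and labels must be non-empty and of equal length")
    joint = Counter(zip(features, labels))  # counts of (x, y) pairs
    px = Counter(features)                  # marginal counts of x
    py = Counter(labels)                    # marginal counts of y
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi


# Hypothetical quantized feature responses; label 1 = center, 0 = surround.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
discriminant = [3, 3, 3, 3, 0, 0, 0, 0]   # separates the two classes perfectly
uninformative = [0, 1, 0, 1, 0, 1, 0, 1]  # identical distribution in both classes

mi_discriminant = mutual_information(discriminant, labels)
mi_uninformative = mutual_information(uninformative, labels)
```

In this toy setting, a feature whose distribution differs between center and surround receives a high score, whereas a feature distributed identically in both classes receives a score of zero, which is the sense in which mutual information ranks features by discriminant power.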
Visual Attention - Computational Approach
Visual attention is a psychological phenomenon in which human visual systems are optimized for capturing scenic information. The robustness and efficiency of the biological devices involved, namely the eyes, their control systems, and the visual paths in the brain, have amazed scientists and engineers for centuries. From Neisser [26] to Marr [25], researchers have put intensive effort into discovering attention principles and engineering artificial systems with equivalent capability. For the last two decades, this research field has been dominated by visual saliency principles, which propose the existence of a saliency map for attention guidance. The idea is further promoted in Feature Integration Theory (FIT) [34], which elaborates computational principles of saliency map generation with center-surround operators and basic image features such as intensity, orientation and color. Then, Itti et al. [21] implemented and released the first complete computer algorithms of FIT. Feature Integration Theory is widely accepted as the principle behind visual attention, partly due to its use of basic image features. Moreover, this hypothesis is supported by evidence from several psychological experiments. However, it only defines theoretical aspects of visual attention with saliency maps, but does not investigate how such principles would be implemented algorithmically. [20], or saliency at each location depends on statistical modeling of the local feature distribution [32]. Though many approaches are mentioned in the long and rich literature of visual saliency, only a few are built on a solid theory or linked to other well-establis...