Many high- and low-altitude Earth observation satellites asynchronously capture massive numbers of aerial images every day. In practice, high-altitude satellites take low-resolution (LR) aerial pictures, each covering a considerably large area. By comparison, low-altitude satellites capture high-resolution (HR) aerial photos, each depicting a relatively small area. Effectively identifying the semantic categories of LR aerial images is an indispensable module in many AI systems. However, it is also a challenging task due to: 1) the inefficiency of labeling adequate training samples, and 2) the difficulty of describing how humans perceive the world. To handle these problems, this work presents an active perception learning scheme coupled with manifold-regularized feature selection (MRFS), aiming to acquire a perceptual and discriminative visual representation for classifying LR aerial photos. In particular, by simulating how humans sequentially perceive different salient regions, we deploy an active learning paradigm to divide an LR aerial image into a few attractive regions and a rich set of non-attractive regions. Theoretically, the active learning technique ensures that the selected attractive regions can maximally reconstruct each LR aerial image, which closely mimics human visual perception. Subsequently, a novel MRFS is designed to select high-quality features from the actively detected attractive regions. MRFS has three advantages: 1) it is built upon a semi-supervised architecture, so only a small proportion of labeled samples is required; 2) a linear classifier based on the selected features can be learned within the unified MRFS framework; 3) the distribution of labeled and unlabeled samples is optimally preserved during feature selection (FS). Extensive empirical results show the superiority of the proposed classification pipeline.
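To make the two stages concrete, the following is a minimal illustrative sketch and not the formulation used in this work. It assumes that each region is described by a feature vector, that attractive regions are chosen greedily to minimize a ridge-regularized reconstruction error, and that feature selection ranks features by the row norms of a linear classifier learned with a graph-Laplacian (manifold) penalty. All function names and parameters (`select_attractive_regions`, `manifold_regularized_fs`, `alpha`, `beta`, `ridge`) are hypothetical, and the Frobenius-norm penalty is a simplification chosen so that the problem has a closed-form solution.

```python
import numpy as np

def select_attractive_regions(X, k, ridge=1e-3):
    """Greedily pick k rows of X (region descriptors) that best reconstruct all rows."""
    selected = []
    for _ in range(k):
        best_idx, best_err = None, np.inf
        for i in range(X.shape[0]):
            if i in selected:
                continue
            S = X[selected + [i]]                      # candidate region set, (m, d)
            G = S @ S.T + ridge * np.eye(S.shape[0])   # regularized Gram matrix
            C = np.linalg.solve(G, S @ X.T).T          # ridge coefficients, X ~ C @ S
            err = np.linalg.norm(X - C @ S) ** 2       # reconstruction error
            if err < best_err:
                best_idx, best_err = i, err
        selected.append(best_idx)
    return selected

def knn_laplacian(F, n_neighbors=3):
    """Unnormalized graph Laplacian of a symmetrized k-nearest-neighbor graph."""
    d2 = ((F[:, None, :] - F[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)                       # exclude self-neighbors
    A = np.zeros_like(d2)
    for i in range(F.shape[0]):
        A[i, np.argsort(d2[i])[:n_neighbors]] = 1.0
    A = np.maximum(A, A.T)
    return np.diag(A.sum(axis=1)) - A

def manifold_regularized_fs(F, Y, labeled, n_features, alpha=1.0, beta=0.1):
    """Semi-supervised feature ranking (assumed, simplified objective).

    Solves min_W ||U^(1/2) (F W - Y)||^2 + alpha * tr(W^T F^T L F W) + beta * ||W||_F^2,
    where U is a diagonal 0/1 matrix marking labeled samples and L is a k-NN
    graph Laplacian built from all (labeled and unlabeled) samples.
    Y should be a centered label-indicator matrix; unlabeled rows are ignored.
    """
    d = F.shape[1]
    L = knn_laplacian(F)
    U = np.diag(np.asarray(labeled, dtype=float))
    A = F.T @ U @ F + alpha * (F.T @ L @ F) + beta * np.eye(d)
    W = np.linalg.solve(A, F.T @ U @ Y)                # closed-form linear classifier
    scores = np.linalg.norm(W, axis=1)                 # one importance score per feature
    top = np.argsort(scores)[::-1][:n_features]
    return top, W
```

In this sketch the classifier `W` and the feature ranking come from the same solve, which mirrors the idea of learning a linear classifier inside the feature selection framework; a sparsity-inducing norm would be a natural replacement for the Frobenius penalty but would require an iterative solver.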