With the rapid development of location-based service applications in recent years, indoor localization based on Wi-Fi channel state information (CSI) has attracted extensive attention. Although spatial variations are explicitly reflected in CSI measurements, the representation differences caused by small contextual changes tend to be submerged by overall multipath fluctuations, especially in device-free localization. Most existing model-based solutions either underutilize the spatial and time-frequency information carried by CSI or employ models that fail to capture small representational changes in CSI, and they therefore struggle to achieve satisfactory localization performance. To address these problems, this paper proposes ProLoc, a progressive device-free localization scheme based on multidimensional CSI feature classification. Unlike previous works that extract features directly from raw measurements, ProLoc constructs a model input containing multiple categories of CSI information to support better location inference. First, a low-rank time-frequency correlation matrix is created from the CSI amplitude, and a third-order CSI tensor is constructed by incorporating spatial diversity. Second, using a novel low-rank matrix factorization algorithm and the time-frequency gradients of CSI, a fourth-order tensor containing multidimensional information is generated as the input data. Finally, employing 3D convolutional neural networks and gated recurrent units, we build a progressive dual-model system that maps CSI to target locations in various spatial contexts. Extensive self-evaluations and comparisons with several state-of-the-art methods demonstrate the superiority of ProLoc.
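As a rough illustration of the input construction outlined above, the following minimal sketch stacks a low-rank component and the time and frequency gradients of the CSI amplitude into a fourth-order tensor. It rests on several assumptions that are not taken from the paper: the CSI amplitude is arranged as (links, packets, subcarriers), a truncated SVD stands in for the proposed low-rank factorization algorithm, the channel layout is arbitrary, and the names `low_rank_approx` and `build_input_tensor` are hypothetical. The data are synthetic.

```python
import numpy as np


def low_rank_approx(matrix, rank):
    """Rank-limited approximation via truncated SVD.

    Assumption: this is only a stand-in for the paper's own low-rank
    matrix factorization algorithm, which is not specified here.
    """
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    return (u[:, :rank] * s[:rank]) @ vt[:rank, :]


def build_input_tensor(csi_amplitude, rank=2):
    """Build a 4th-order tensor from a 3rd-order CSI amplitude tensor.

    csi_amplitude: array of shape (links, packets, subcarriers), i.e.
    spatial diversity x time x frequency (assumed layout).
    Returns an array of shape (3, links, packets, subcarriers) whose
    channels are: low-rank component, time gradient, frequency gradient.
    """
    # Low-rank time-frequency component, computed per antenna link.
    low_rank = np.stack([low_rank_approx(m, rank) for m in csi_amplitude])
    # Finite-difference gradients along the time and frequency axes.
    grad_time = np.gradient(csi_amplitude, axis=1)
    grad_freq = np.gradient(csi_amplitude, axis=2)
    return np.stack([low_rank, grad_time, grad_freq], axis=0)


# Synthetic amplitudes: 3 links, 50 packets, 30 subcarriers (illustrative only).
rng = np.random.default_rng(0)
csi = np.abs(rng.standard_normal((3, 50, 30)))
tensor = build_input_tensor(csi, rank=2)
print(tensor.shape)
```

A tensor of this shape could then be passed to a 3D convolutional network, treating the leading axis as input channels; how ProLoc actually orders and normalizes these dimensions is not stated in this abstract.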