Methods for retrieving vegetation variables from reflectance data are typically grouped into three categories: statistical-empirical, physical, and hybrid (physical models applied to statistical models). Because leaf spectra in the optical domain are highly similar to one another, leaf reflectance spectra can be modelled linearly using a very limited number of principal components (PCs) if the principal component analysis (PCA) transformation is carried out along the sample dimension. In this paper, we present a novel data-driven approach that uses the PCA transformation both to reconstruct leaf reflectance spectra and to retrieve leaf biochemical contents. First, the PCA transformation was carried out on a training dataset simulated by the PROSPECT-5 model. The results showed that leaf reflectance spectra can be accurately reconstructed using only a few leading PCs: the ten leading PCs contained 99.999% of the total information in the 3636 training samples. The spectral error between the simulated or measured reflectance and the reconstructed spectra was also investigated using the simulated and measured datasets (ANGERS and LOPEX'93). The mean root mean squared error (RMSE) values varied from 5.56 × 10⁻⁵ to 6.18 × 10⁻³, which for the measured datasets is about 3-10 times more accurate than the PROSPECT simulation method. Second, the relationship between PCs and leaf biochemical components was investigated, and we found that the PCs are closely related to both the leaf biochemical components and the reflectance spectra. When only the weighting coefficient of the most sensitive PC was used to retrieve the leaf biochemical contents, the coefficients of determination for the PCA data-driven model were 0.69, 0.99, 0.94 and 0.68 for the specific leaf weight (SLW), equivalent water thickness (EWT), chlorophyll content (Cab) and carotenoid content (Car), respectively. 
Finally, statistical models for the retrieval of leaf biochemical contents were developed based on the weighting coefficients of the sensitive PCs, and the PCA data-driven models were validated and compared with the traditional VI-based and physically-based approaches for the retrieval of leaf properties. The results show that the PCA method performs similarly to or better than these approaches in estimating leaf biochemical contents. The PCA method therefore provides a new and accurate data-driven way to reconstruct leaf reflectance spectra and to retrieve leaf biochemical contents.
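
The workflow summarised above (fit PCs on training spectra, reconstruct spectra from a few leading PCs, then regress a leaf trait on the weighting coefficient of the most sensitive PC) can be illustrated with a minimal sketch. It is not the implementation used in this work. The spectra here are toy data built from two Gaussian absorption features rather than PROSPECT-5 simulations, the trait names `pigment` and `water` are hypothetical stand-ins, and the helper functions `fit_pca`, `pc_weights`, `reconstruct` and `mean_rmse` are names introduced only for illustration. Selecting the "most sensitive" PC by absolute correlation is likewise an assumption.

```python
import numpy as np


def fit_pca(spectra, n_components):
    """PCA via SVD of the mean-centred (samples x bands) matrix."""
    mean = spectra.mean(axis=0)
    _, s, vt = np.linalg.svd(spectra - mean, full_matrices=False)
    explained = s**2 / np.sum(s**2)          # fraction of variance per PC
    return mean, vt[:n_components], explained[:n_components]


def pc_weights(spectra, mean, components):
    """Project spectra onto the PCs to get weighting coefficients."""
    return (spectra - mean) @ components.T


def reconstruct(weights, mean, components):
    """Rebuild spectra as mean + linear combination of PCs."""
    return mean + weights @ components


def mean_rmse(a, b):
    """Mean over samples of the per-spectrum RMSE."""
    return float(np.mean(np.sqrt(np.mean((a - b) ** 2, axis=1))))


# --- Toy data (NOT PROSPECT-5 output): two Gaussian absorption dips ---
rng = np.random.default_rng(0)
wl = np.linspace(400.0, 2500.0, 211)              # wavelength grid, nm
pigment = rng.uniform(10.0, 80.0, 300)            # hypothetical trait 1
water = rng.uniform(0.004, 0.04, 300)             # hypothetical trait 2
dip1 = np.exp(-((wl - 670.0) / 60.0) ** 2)
dip2 = np.exp(-((wl - 1450.0) / 120.0) ** 2)
spectra = (0.5
           - 0.40 * (1 - np.exp(-pigment[:, None] / 30.0)) * dip1
           - 0.15 * (1 - np.exp(-water[:, None] * 50.0)) * dip2
           + rng.normal(0.0, 1e-3, (300, wl.size)))

train, test = slice(0, 200), slice(200, 300)
mean, pcs, explained = fit_pca(spectra[train], n_components=3)

# Reconstruction error on held-out spectra with 1 vs 3 leading PCs
w_test = pc_weights(spectra[test], mean, pcs)
rmse_1 = mean_rmse(spectra[test], reconstruct(w_test[:, :1], mean, pcs[:1]))
rmse_3 = mean_rmse(spectra[test], reconstruct(w_test, mean, pcs))

# Retrieval: pick the PC whose weights correlate best with the trait,
# then fit a one-variable linear model on the training set.
w_train = pc_weights(spectra[train], mean, pcs)
corr = [abs(np.corrcoef(w_train[:, j], pigment[train])[0, 1]) for j in range(3)]
best = int(np.argmax(corr))
slope, intercept = np.polyfit(w_train[:, best], pigment[train], 1)
pred = slope * w_test[:, best] + intercept
ss_res = np.sum((pigment[test] - pred) ** 2)
ss_tot = np.sum((pigment[test] - pigment[test].mean()) ** 2)
r2 = 1.0 - ss_res / ss_tot

print(f"explained by 3 PCs: {explained.sum():.4f}")
print(f"RMSE 1 PC: {rmse_1:.2e}, 3 PCs: {rmse_3:.2e}, R2: {r2:.2f}")
```

The numbers printed by this sketch depend entirely on the toy data and say nothing about the RMSE or coefficients of determination reported above. With real data, the training matrix would hold PROSPECT-5 simulations and the test matrix measured spectra resampled to the same wavelength grid.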