Abstract. A methodology for analyzing the properties of the first (largest) eigenvalue and its eigenvector is developed for large symmetric random sparse matrices using the cavity method of statistical mechanics. Under a tree approximation, which is plausible for infinitely large systems, and with a Lagrange multiplier introduced to constrain the length of the eigenvector, the eigenvalue problem is reduced to a set of optimization problems, each over a quadratic function of a single variable. The coefficients of the first- and second-order terms of these functions act as the cavity fields handled in the cavity analysis. We show that the first eigenvalue is determined by the condition that the distribution of the cavity fields has a finite second-order moment with respect to the cavity fields of the first-order coefficient. The validity and utility of the methodology are examined by applying it to two analytically solvable examples and one simple but non-trivial example, together with numerical justification.
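As a rough illustration of the role of the Lagrange multiplier, the constrained formulation can be sketched as follows. The notation is assumed here and is not taken from the abstract: $J$ for the symmetric matrix, $\boldsymbol{v}$ for the candidate eigenvector, the normalization $|\boldsymbol{v}|^2 = N$, and the symbols $A$ and $H$ for the second- and first-order coefficients of a generic single-variable quadratic. The sketch shows only the standard variational characterization of an eigenvalue and the closed-form maximum of a quadratic, not the cavity equations themselves.

```latex
% Assumed notation: J is the N x N symmetric matrix, v the candidate eigenvector,
% \lambda the Lagrange multiplier, and the length constraint is |v|^2 = N.
\begin{align}
  \mathcal{L}(\boldsymbol{v}, \lambda)
    &= \frac{1}{2}\,\boldsymbol{v}^{\top} J \boldsymbol{v}
     - \frac{\lambda}{2}\left(\boldsymbol{v}^{\top}\boldsymbol{v} - N\right), \\
  \frac{\partial \mathcal{L}}{\partial \boldsymbol{v}} = 0
    &\;\Longrightarrow\; J\boldsymbol{v} = \lambda \boldsymbol{v}, \\
  \boldsymbol{v}^{\top} J \boldsymbol{v}
    &= \lambda N \quad \text{at any stationary point.}
\end{align}
% Generic single-variable quadratic with second-order coefficient A > 0
% and first-order coefficient H (symbols assumed for illustration):
\begin{equation}
  \max_{v}\left\{ -\frac{A}{2}\,v^{2} + H v \right\} = \frac{H^{2}}{2A},
  \qquad v^{\ast} = \frac{H}{A}.
\end{equation}
```

Every stationary point of the Lagrangian is an eigenvector, with the multiplier as its eigenvalue, so the constrained maximum of $\boldsymbol{v}^{\top} J \boldsymbol{v}$ is attained at the first eigenvalue.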
Scalability is a key requirement for any KDD and data mining algorithm, and one of the biggest research challenges is to develop methods that can make use of large amounts of data. One possible approach to handling huge amounts of data is to take a random sample and mine it, since approximate answers are acceptable for many data mining applications. However, as several researchers have argued, random sampling is hard to use because it is difficult to determine an appropriate sample size. In this paper, we take a sequential sampling approach to this difficulty and propose an adaptive sampling algorithm that solves a general problem covering many of the problems arising in applications of discovery science. The algorithm obtains examples sequentially in an on-line fashion and determines from the examples obtained so far whether it has already seen enough of them. Thus, the sample size is not fixed a priori; instead, it depends adaptively on the situation. Because of this adaptiveness, if we are not in a worst-case situation, as fortunately happens in many practical applications, we can solve the problem with far fewer examples than are required in the worst case. To illustrate the generality of our approach, we also describe how different instantiations of it can be applied to scale up knowledge discovery problems that appear in several areas.
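The idea of a sample size that is decided on-line can be illustrated with a minimal sketch. This is not the algorithm proposed in the paper, which the abstract does not specify. It is a simplified stand-in that estimates the mean of values in [0, 1] to within a relative error, using a Hoeffding bound with a union bound over stopping times. The function name `adaptive_sample_mean` and the parameters `draw`, `eps`, `delta`, and `max_examples` are all hypothetical.

```python
import math
import random
from typing import Callable, Tuple


def adaptive_sample_mean(draw: Callable[[], float], eps: float, delta: float,
                         max_examples: int = 1_000_000) -> Tuple[float, int]:
    """Estimate the mean of values in [0, 1] to within relative error eps.

    Illustrative stopping rule only (an assumption, not the paper's method).
    Returns the estimate and the number of examples actually drawn.
    """
    total = 0.0
    for t in range(1, max_examples + 1):
        total += draw()
        mean = total / t
        # Hoeffding half-width at step t. Spending delta / (t * (t + 1)) at
        # step t keeps the total failure probability over all steps <= delta.
        delta_t = delta / (t * (t + 1))
        half_width = math.sqrt(math.log(2.0 / delta_t) / (2.0 * t))
        # Stop once the interval is narrow relative to a lower bound on the
        # true mean; easy (large-mean) cases therefore stop much earlier.
        if half_width <= eps * (mean - half_width):
            return mean, t
    # Budget exhausted: return the best estimate without the guarantee.
    return total / max_examples, max_examples


if __name__ == "__main__":
    rng = random.Random(0)
    estimate, used = adaptive_sample_mean(
        lambda: float(rng.random() < 0.7), eps=0.1, delta=0.05)
    print(f"estimate={estimate:.3f} after {used} examples")
```

The number of examples drawn depends on the data: a stream with a large mean satisfies the stopping test far sooner than one with a small mean, which is the sense in which the sample size is adaptive rather than fixed in advance.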
The present study compared the ability of the spectral vegetation indices (VI) of the Advanced Very High Resolution Radiometer (AVHRR) and Moderate Resolution Imaging Spectroradiometer (MODIS) sensors to accurately detect seasonal vegetation changes (phenology) with regard to forage quantity and quality. The normalized difference vegetation index (NDVI) and enhanced vegetation index (EVI) were computed with a 10-day maximum value composite from April 1 to October 31, 2002. The study sites included four meadow steppes and six typical steppes in the Xilingol steppe area of central Inner Mongolia, China. Comparisons of the MODIS-NDVI and AVHRR-NDVI profiles revealed that the MODIS-NDVI temporal profile had a higher fidelity. The dynamic range of the MODIS-NDVI was then analyzed, and its sensitivity in discriminating between vegetation differences was evaluated in sparsely and densely vegetated areas. Estimates of live, dead standing, and total biomass, crude protein (CP) concentration, and standing CP were obtained using AVHRR-NDVI (1.1 km pixels) and MODIS-NDVI and MODIS-EVI (500 m pixels). Regression analysis revealed that the MODIS-VI showed a good coefficient of determination (R² = 0.77-0.83) for estimates of the total and live biomass. Furthermore, the MODIS-EVI was a better predictor of standing CP (R² = 0.74) than AVHRR (R² = 0.53). These results suggest that the MODIS-VI can reliably detect the phenology and forage quantity and quality of grassland steppe areas.
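For readers unfamiliar with the indices, the following minimal sketch shows how NDVI, EVI, and a maximum value composite are commonly computed. The formulas are the standard definitions (with the usual MODIS EVI coefficients) and are not quoted from the study. The function names, the per-pixel scalar inputs, and the simple non-overlapping windowing are assumptions made for illustration.

```python
from typing import List, Sequence


def ndvi(nir: float, red: float) -> float:
    """Normalized difference vegetation index from NIR and red reflectance."""
    return (nir - red) / (nir + red)


def evi(nir: float, red: float, blue: float) -> float:
    """Enhanced vegetation index with the standard MODIS coefficients
    (gain 2.5, aerosol terms 6 and 7.5, canopy background term 1)."""
    return 2.5 * (nir - red) / (nir + 6.0 * red - 7.5 * blue + 1.0)


def max_value_composite(daily: Sequence[float], window: int = 10) -> List[float]:
    """Keep the largest index value in each consecutive window of days.

    Taking the maximum suppresses cloud-contaminated observations, which
    tend to lower the index. The last window may be shorter than `window`.
    """
    return [max(daily[i:i + window]) for i in range(0, len(daily), window)]


if __name__ == "__main__":
    # Hypothetical reflectance values for a single vegetated pixel.
    print(ndvi(nir=0.5, red=0.1))
    print(evi(nir=0.5, red=0.1, blue=0.05))
    print(max_value_composite([0.1, 0.3, 0.2, 0.5, 0.4], window=2))
```

In practice the same operations would be applied per pixel over image arrays; scalars are used here only to keep the sketch dependency-free.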