In this paper, we consider the problem of discovering interesting substructures in a large collection of semi-structured data within the framework of optimized pattern discovery. We model semi-structured data and patterns as labeled ordered trees, and present an efficient algorithm that discovers the best labeled ordered trees, that is, those that optimize a given statistical measure such as information entropy or classification accuracy, in a collection of semi-structured data. We give theoretical analyses of the computational complexity of the algorithm for patterns of bounded and unbounded size. Experiments show that the algorithm performs well and discovers interesting patterns on real datasets.
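To make the idea of optimizing a statistical measure concrete, the following is a minimal sketch of how a single candidate pattern might be scored by information entropy, assuming each document carries a binary class label and a pattern splits the collection into documents it matches and documents it does not. The function names and the four count parameters are hypothetical and do not come from the paper; the sketch covers only the scoring step, not the search over labeled ordered trees.

```python
import math


def binary_entropy(p):
    """Shannon entropy, in bits, of a two-class distribution with positive rate p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0  # a pure group carries no uncertainty
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def split_entropy(matched_pos, matched_neg, unmatched_pos, unmatched_neg):
    """Weighted class entropy after splitting documents by whether a pattern matches.

    Hypothetical helper: the four arguments are counts of positive and negative
    documents that do and do not contain the candidate pattern.
    Lower values mean the pattern separates the two classes better.
    """
    matched = matched_pos + matched_neg
    unmatched = unmatched_pos + unmatched_neg
    total = matched + unmatched
    if total == 0:
        raise ValueError("at least one document is required")
    score = 0.0
    if matched:
        score += (matched / total) * binary_entropy(matched_pos / matched)
    if unmatched:
        score += (unmatched / total) * binary_entropy(unmatched_pos / unmatched)
    return score
```

Under these assumptions, a pattern that matches exactly the positive documents scores 0 bits, while a pattern whose matched and unmatched groups are both evenly mixed scores 1 bit; an optimized pattern discovery procedure would prefer the pattern with the lower score.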
The accumulation and recovery of radiation-induced damage from swift heavy ions in stoichiometric magnesium aluminate spinel, MgAl2O4, have been investigated. Microstructural change and atomic disordering were examined by transmission electron microscopy (TEM) techniques, using bright-field (BF) and high-resolution (HR) TEM images, and by high angular resolution electron channelling X-ray spectroscopy (HARECXS), for single-crystal MgAl2O4 irradiated with 200 MeV Xe ions and 340 or 350 MeV Au ions. The density of the core damage regions, detected by BFTEM with Fresnel contrast, increased in proportion to ion fluence at the early stage of accumulation and saturated at fluences higher than 10^16 ions/m^2. This result is discussed in terms of a balance between the formation and recovery of core damage regions under irradiation, and the region of influence that induces recovery was evaluated to be 7–9 nm in radius. HARECXS and electron diffraction analysis revealed that cations at tetrahedral sites preferentially move to octahedral sites, transforming the material to a defective rock-salt structure. HR and BFTEM images show that the core damage region is a columnar, vacancy-rich region with a low atomic density.
In this paper, we study an online data mining problem from streams of semi-structured data such as XML data. Modeling semi-structured data and patterns as labeled ordered trees, we present an online algorithm StreamT that receives fragments of unseen, possibly infinite semi-structured data in the document order through a data stream, and can return the current set of frequent patterns immediately on request at any time. A crucial part of our algorithm is the incremental maintenance of the occurrences of possibly frequent patterns using a tree sweeping technique. We give modifications of the algorithm to other online mining models. We present theoretical and empirical analyses to evaluate the performance of the algorithm.