Clustering suffers from the curse of dimensionality, and similarity functions that use all input features with equal relevance may not be effective. We introduce an algorithm that discovers clusters in subspaces spanned by different combinations of dimensions via local weightings of features. This approach avoids the risk of information loss encountered in global dimensionality reduction techniques, and does not assume any data distribution model. Our method associates with each cluster a weight vector whose values capture the relevance of features within the corresponding cluster. We experimentally demonstrate the gain in performance our method achieves with respect to competitive methods, using both synthetic and real datasets. In particular, our results show the feasibility of the proposed technique to perform simultaneous clustering of genes and conditions in gene expression data, and clustering of very high-dimensional data such as text data.
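The idea of a per-cluster weight vector can be illustrated with a minimal Python sketch. The abstract does not state how the weights are computed, so the exponential weighting below, the bandwidth parameter `h`, and the function names `local_weights` and `weighted_distance` are assumptions chosen for illustration, not the authors' exact formulation. The sketch gives a larger weight to dimensions along which a cluster's points lie close to the centroid, then uses those weights in a Euclidean-style distance.

```python
import math

def local_weights(points, centroid, h=1.0):
    """Return one weight per dimension for a single cluster.

    Assumption: weights decay exponentially with the average squared
    deviation of the cluster's points from the centroid along each
    dimension, and are normalized to sum to 1. `h` is a hypothetical
    bandwidth parameter controlling how sharply weights concentrate.
    """
    n = len(points)
    dims = len(centroid)
    # Average squared deviation from the centroid, per dimension.
    spread = [
        sum((p[i] - centroid[i]) ** 2 for p in points) / n
        for i in range(dims)
    ]
    raw = [math.exp(-s / h) for s in spread]
    total = sum(raw)
    return [r / total for r in raw]

def weighted_distance(x, centroid, weights):
    """Euclidean-style distance with each dimension scaled by its weight."""
    return math.sqrt(
        sum(w * (a - c) ** 2 for w, a, c in zip(weights, x, centroid))
    )

# A cluster that is tight along dimension 0 and spread out along dimension 1.
cluster = [(1.0, 0.0), (1.0, 10.0), (1.0, -10.0)]
center = (1.0, 0.0)
w = local_weights(cluster, center)
```

With these weights, a point that deviates along the noisy dimension 1 stays close to the cluster, while a point that deviates along the tight dimension 0 is penalized. This is the sense in which each cluster defines its own relevant subspace.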
Entropy-type measures for the heterogeneity of clusters have been used for a long time. This paper studies the entropy-based criterion in clustering categorical data. It first shows that the entropy-based criterion can be derived in the formal framework of probabilistic clustering models and establishes the connection between the criterion and the approach based on dissimilarity coefficients. An iterative Monte-Carlo procedure is then presented to search for the partitions minimizing the criterion. Experiments are conducted to show the effectiveness of the proposed procedure.
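As a rough illustration of an entropy-based criterion and a randomized search over partitions, here is a minimal Python sketch. The abstract does not give the exact procedure, so this is an assumption-laden stand-in: the criterion is taken to be the size-weighted sum of per-attribute Shannon entropies within each cluster, and the search is a simple random single-point reassignment that keeps a move only when the criterion decreases. The names `cluster_entropy`, `expected_entropy`, and `monte_carlo_search` are hypothetical.

```python
import math
import random
from collections import Counter

def cluster_entropy(rows):
    """Sum, over attributes, of the Shannon entropy of the values in `rows`."""
    n = len(rows)
    if n == 0:
        return 0.0
    total = 0.0
    for column in zip(*rows):
        for count in Counter(column).values():
            p = count / n
            total -= p * math.log(p)
    return total

def expected_entropy(data, labels, k):
    """Size-weighted average of cluster entropies (the assumed criterion)."""
    n = len(data)
    total = 0.0
    for c in range(k):
        members = [row for row, lab in zip(data, labels) if lab == c]
        total += len(members) / n * cluster_entropy(members)
    return total

def monte_carlo_search(data, k, iterations=2000, seed=0):
    """Randomly reassign one point at a time; keep moves that lower the criterion."""
    rng = random.Random(seed)
    labels = [rng.randrange(k) for _ in data]
    best = expected_entropy(data, labels, k)
    for _ in range(iterations):
        i = rng.randrange(len(data))
        old, new = labels[i], rng.randrange(k)
        if new == old:
            continue
        labels[i] = new
        score = expected_entropy(data, labels, k)
        if score < best:
            best = score      # accept the improving move
        else:
            labels[i] = old   # revert the move
    return labels, best

data = [("a", "x"), ("a", "x"), ("b", "y"), ("b", "y")]
labels, score = monte_carlo_search(data, k=2)
```

A partition whose clusters are homogeneous in every attribute scores zero under this criterion, which is its minimum. The greedy acceptance rule used here can stall in a local minimum, so restarting from several seeds is a reasonable precaution.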
Hydroxyproline-O-galactosyltransferase (GALT) initiates O-glycosylation of arabinogalactan-proteins (AGPs). We previously characterized GALT2 (At4g21060), and now report on functional characterization of GALT5 (At1g74800). GALT5 was identified using heterologous expression in Pichia and an in vitro GALT assay. Product characterization showed GALT5 specifically adds galactose to hydroxyproline in AGP protein backbones. Functions of GALT2 and GALT5 were elucidated by phenotypic analysis of single and double mutant plants. Allelic galt5 and galt2 mutants, and particularly galt2 galt5 double mutants, demonstrated lower GALT activities and reductions in β-Yariv-precipitated AGPs compared to wild type. Mutant plants showed pleiotropic growth and development phenotypes (defects in root hair growth, root elongation, pollen tube growth, flowering time, leaf development, silique length, and inflorescence growth), which were most severe in the double mutants. Conditional mutant phenotypes were also observed, including salt-hypersensitive root growth and root tip swelling as well as reduced inhibition of pollen tube growth and root growth in response to β-Yariv reagent. These mutants also phenocopy mutants for an AGP, SOS5, and two cell wall receptor-like kinases, FEI1 and FEI2, which exist in a genetic signaling pathway. In summary, GALT5 and GALT2 function as redundant GALTs that control AGP O-glycosylation, which is essential for normal growth and development.