We introduce novel profile-based string kernels for use with support vector machines (SVMs) for the problems of protein classification and remote homology detection. These kernels use probabilistic profiles, such as those produced by the PSI-BLAST algorithm, to define position-dependent mutation neighborhoods along protein sequences for inexact matching of k-length subsequences ("k-mers") in the data. By use of an efficient data structure, the kernels are fast to compute once the profiles have been obtained. For example, the time needed to run PSI-BLAST in order to build the profiles is significantly longer than both the kernel computation time and the SVM training time. We present remote homology detection experiments based on the SCOP database where we show that profile-based string kernels used with SVM classifiers strongly outperform all recently presented supervised SVM methods. We further examine how to incorporate predicted secondary structure information into the profile kernel to obtain a small but significant performance improvement. We also show how we can use the learned SVM classifier to extract "discriminative sequence motifs"--short regions of the original profile that contribute almost all the weight of the SVM classification score--and show that these discriminative motifs correspond to meaningful structural features in the protein data. The use of PSI-BLAST profiles can be seen as a semi-supervised learning technique, since PSI-BLAST leverages unlabeled data from a large sequence database to build more informative profiles. Recently presented "cluster kernels" give general semi-supervised methods for improving SVM protein classification performance. We show that our profile kernel results also outperform cluster kernels while providing much better scalability to large datasets.
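The idea of a position-dependent mutation neighborhood can be illustrated with a small brute-force sketch. The abstract does not give the neighborhood rule, so the following are all assumptions made for illustration: a k-mer belongs to the neighborhood of a profile window when its summed negative log emission probability falls below a threshold `sigma`; the alphabet is a toy two-letter one; and the names `profile_feature_map` and `profile_kernel` are hypothetical. The efficient data structure mentioned above is replaced here by plain enumeration, which is only practical for tiny alphabets and small k.

```python
import itertools
import math
from collections import Counter

# Toy two-letter alphabet (assumption); real protein profiles use 20 amino acids.
ALPHABET = "AB"


def profile_feature_map(profile, k, sigma):
    """Count, for each k-mer, how many length-k profile windows accept it.

    `profile` is a list of dicts mapping each letter to a strictly positive
    emission probability at that position. A k-mer is accepted by a window
    when its summed negative log-probability is below `sigma` (assumed rule).
    """
    counts = Counter()
    for start in range(len(profile) - k + 1):
        window = profile[start:start + k]
        for kmer in itertools.product(ALPHABET, repeat=k):
            cost = -sum(math.log(col[ch]) for col, ch in zip(window, kmer))
            if cost < sigma:
                counts["".join(kmer)] += 1
    return counts


def profile_kernel(profile_x, profile_y, k, sigma):
    """Inner product of the two k-mer count vectors."""
    fx = profile_feature_map(profile_x, k, sigma)
    fy = profile_feature_map(profile_y, k, sigma)
    return sum(fx[m] * fy[m] for m in fx)


# Hypothetical toy profiles: each column is a distribution over ALPHABET.
px = [{"A": 0.9, "B": 0.1}, {"A": 0.9, "B": 0.1}, {"A": 0.1, "B": 0.9}]
py = [{"A": 0.9, "B": 0.1}, {"A": 0.1, "B": 0.9}]
print(profile_kernel(px, py, k=2, sigma=1.0))
```

Because the threshold is applied to profile columns rather than to a fixed substitution rule, the set of accepted k-mers differs from window to window, which is what makes the neighborhood position-dependent. A resulting kernel matrix could then be passed to any SVM implementation that accepts precomputed kernels.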
This paper presents a hierarchical watermarking framework for semiregular meshes. Three blind watermarks are inserted in a semiregular mesh with different purposes: a geometrically robust watermark for copyright protection, a high-capacity watermark for carrying a large amount of auxiliary information, and a fragile watermark for content authentication. The proposed framework is based on the wavelet transform of the semiregular mesh. More precisely, the three watermarks are inserted in different resolution levels obtained by wavelet decomposition of the mesh: the robust watermark is inserted by modifying the norms of the wavelet coefficient vectors associated with the lowest resolution level; the fragile watermark is embedded in the high resolution level obtained after a single wavelet decomposition, by modifying the orientations and norms of the wavelet coefficient vectors; the high-capacity watermark is inserted in one or several intermediate levels by considering groups of wavelet coefficient vector norms as watermarking primitives. Experimental results demonstrate the effectiveness of the proposed framework: the robust watermark resists all common geometric attacks, even at relatively strong amplitudes; the fragile watermark is robust to content-preserving operations while remaining sensitive to other attacks, which it can also localize precisely; the payload of the high-capacity watermark increases rapidly with the number of watermarking primitives.
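To make the notion of embedding a bit by modifying a vector norm concrete, here is a minimal sketch. It is not the paper's embedding scheme, which the abstract does not specify: it swaps in plain scalar quantization index modulation (QIM) applied to the norm of a single coefficient vector. The function names `embed_bit` and `extract_bit` and the quantization step `step` are hypothetical, and the input vector is assumed to be nonzero.

```python
import math


def vector_norm(vec):
    """Euclidean norm of a coefficient vector."""
    return math.sqrt(sum(c * c for c in vec))


def embed_bit(vec, bit, step):
    """Rescale `vec` so its norm lies on the quantization lattice for `bit`.

    Bit 0 uses multiples of `step`; bit 1 uses the same lattice shifted by
    step / 2. Assumes `vec` is nonzero. Only the norm changes, not the
    direction of the vector.
    """
    norm = vector_norm(vec)
    offset = bit * step / 2
    new_norm = round((norm - offset) / step) * step + offset
    if new_norm <= 0:
        new_norm += step  # avoid collapsing the vector to zero length
    scale = new_norm / norm
    return [c * scale for c in vec]


def extract_bit(vec, step):
    """Blind extraction: pick the lattice whose nearest point is closer."""
    norm = vector_norm(vec)
    d0 = abs(norm - round(norm / step) * step)
    d1 = abs(norm - (round((norm - step / 2) / step) * step + step / 2))
    return 0 if d0 <= d1 else 1


# Hypothetical coefficient vector and step size.
marked = embed_bit([3.0, 4.0, 0.0], bit=1, step=1.0)

# A rotation about the z-axis preserves the norm, so the bit is unchanged.
angle = 0.7
rotated = [
    marked[0] * math.cos(angle) - marked[1] * math.sin(angle),
    marked[0] * math.sin(angle) + marked[1] * math.cos(angle),
    marked[2],
]
print(extract_bit(marked, 1.0), extract_bit(rotated, 1.0))
```

Extraction needs only the step size and not the original vector, which is the sense in which such a scheme is blind. In this toy setting, a larger `step` tolerates larger perturbations of the norm at the cost of a larger embedding distortion.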