N. Jacq et al., Parallel Computing 30 (2004) 1093–1107, doi:10.1016/j.parco.2004.07.013: The grid is a promising tool to resolve the crucial issue of software and data integration in biology. In this paper, we have reported on our experience in the deployment of bioinformatic grid applications within the framework of the DataGrid project. These applications investigated the potential impact of grids for CPU-demanding algorithms and bioinformatics web portal
Once a new gene has been sequenced, it must be verified whether or not it is similar to previously sequenced genes. In many cases, the organization that sequenced a potentially novel gene needs to keep the sequence itself in confidence. However, to compare the potentially novel sequence with known sequences, it must either be sent as a query to public databases, or these databases must be downloaded onto a local computer. In both cases, the potentially new sequence is exposed to the public. In this work, we propose a new method, called Interval Sampling, to compare sequences without leaking exact information about the new sequence. In order to keep the exact sequence information secret, this method samples intervals (subsequences) from a sequence, and these intervals are hashed. The hashed data are open to the public to verify the novelty of the sequence. We find that this method works well in parallel in a distributed computing environment, such as the Grid. The experimental results for 19797 H. sapiens genes and 25000 M. musculus genes show that the parallel implementation of this method performs reasonably well in terms of speed and memory usage.
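The idea of sampling intervals and publishing only their hashes can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify the interval length, the sampling scheme, the hash function, or the similarity score, so the fixed-length intervals, the fixed stride, SHA-256, and the function names `hashed_sample`, `hashed_index`, and `match_fraction` below are all assumptions made for illustration.

```python
import hashlib


def hash_interval(interval: str) -> str:
    """Hash one subsequence (SHA-256 is an assumption, not from the paper)."""
    return hashlib.sha256(interval.encode("ascii")).hexdigest()


def hashed_sample(sequence: str, length: int = 8, stride: int = 4) -> set:
    """Sample fixed-length intervals every `stride` positions and hash them.

    Only these hashes would be published, never the sequence itself.
    """
    last_start = len(sequence) - length
    return {
        hash_interval(sequence[i:i + length])
        for i in range(0, last_start + 1, stride)
    }


def hashed_index(sequence: str, length: int = 8) -> set:
    """Hash every interval of a known sequence (stride 1), so a sampled
    interval from a query can match regardless of its offset."""
    return hashed_sample(sequence, length, stride=1)


def match_fraction(query_hashes: set, index_hashes: set) -> float:
    """Fraction of the query's hashed intervals found in the known sequence."""
    if not query_hashes:
        return 0.0
    return len(query_hashes & index_hashes) / len(query_hashes)


# Hypothetical toy data, not taken from the paper's experiments.
known = "ACGTACGTTAGCCGATAGGCTTAACG"
overlapping = known[3:23]        # a stretch that already exists in `known`
unrelated = "TTTTTTTTTTTTTTTT"   # shares no 8-base interval with `known`

index = hashed_index(known)
overlap_score = match_fraction(hashed_sample(overlapping), index)
unrelated_score = match_fraction(hashed_sample(unrelated), index)
```

In this sketch a high match fraction suggests the sequence is not novel, while a low one suggests it may be. Each known sequence can be indexed and compared independently, which is one way to read the abstract's remark that the method parallelizes well on a grid.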