With the increasing amount of available genome sequences, novel tools are needed for comprehensive analysis of species-specific sequence characteristics for a wide variety of genomes. We used an unsupervised neural network algorithm, a self-organizing map (SOM), to analyze di-, tri-, and tetranucleotide frequencies in a wide variety of prokaryotic and eukaryotic genomes. The SOM, which can cluster complex data efficiently, was shown to be an excellent tool for analyzing global characteristics of genome sequences and for revealing key combinations of oligonucleotides representing individual genomes. From analysis of 1- and 10-kb genomic sequences derived from 65 bacteria (a total of 170 Mb) and from 6 eukaryotes (460 Mb), clear species-specific separations of major portions of the sequences were obtained with the di-, tri-, and tetranucleotide SOMs. The unsupervised algorithm could recognize, in most 10-kb sequences, the species-specific characteristics (key combinations of oligonucleotide frequencies) that are signature features of each genome. We were able to classify DNA sequences within one and between many species into subgroups that corresponded generally to biological categories. Because the classification power is very high, the SOM is an efficient and fundamental bioinformatic strategy for extracting a wide range of genomic information from a vast amount of sequences. [Supplemental material is available online at www.genome.org.]

In addition to protein-coding information, genome sequences contain a wealth of information of interest in many fields of biology, from molecular evolution to genome engineering. G+C% has been used as a fundamental characteristic of individual genomes, but the G+C% is apparently too simple a parameter to differentiate a wide variety of genomes of known sequences.
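As a rough illustration of the quantities discussed here, the following Python sketch computes G+C% and oligonucleotide frequencies for fixed-size windows of a sequence. The function names (`kmer_frequencies`, `gc_percent`, `windows`) are hypothetical. Counting overlapping oligonucleotides on a single strand, skipping those that contain non-ACGT characters, and using non-overlapping windows are all assumptions of this sketch, not details taken from the text.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(seq, k):
    """Relative frequency of each of the 4**k oligonucleotides of length k.

    Assumption: overlapping k-mers are counted on the given strand only,
    and k-mers containing characters other than A, C, G, T are ignored.
    """
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[m] for m in kmers)
    return {m: (counts[m] / total if total else 0.0) for m in kmers}


def gc_percent(seq):
    """G+C content as a percentage of the unambiguous bases."""
    seq = seq.upper()
    acgt = sum(seq.count(b) for b in "ACGT")
    if acgt == 0:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / acgt


def windows(seq, size):
    """Split seq into non-overlapping windows of exactly `size` bases."""
    return [seq[i:i + size] for i in range(0, len(seq) - size + 1, size)]


# One frequency vector per window; with k=4 each vector has 256 entries.
example = "ACGTACGTAACCGGTT"
vectors = [kmer_frequencies(w, 2) for w in windows(example, 8)]
```

The two 8-base windows of `example`, `ACGTACGT` and `AACCGGTT`, both have a G+C% of 50, yet the dinucleotide `AA` occurs only in the second. This is a toy instance of why G+C% alone can fail to separate sequences that oligonucleotide frequencies distinguish.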
Oligonucleotide frequency can be used to distinguish genomes, because oligonucleotide frequencies vary significantly among genomes; dinucleotide frequencies, for example, are shown to be genome signatures for both prokaryotes and eukaryotes (Nussinov 1984; Karlin et al. 1997; Karlin 1998; Gentles and Karlin 2001). Comprehensive analyses of oligonucleotide frequencies in a wide variety of genomes are thought to provide fundamental knowledge of individual genomes, namely, key combinations of oligonucleotides responsible for the biological properties of the different genomes and genome portions. We applied Kohonen's self-organizing map (SOM) to create graphical representations of oligonucleotide frequencies from which we could extract a wide range of genomic information. The unsupervised neural network algorithm is an effective tool for clustering and visualizing high-dimensional data; it converts complex nonlinear relations among high-dimensional data into simple geometric relations that can be viewed in two dimensions (Kohonen 1982, 1990; Kohonen et al. 1996). We and others have used SOMs to characterize codon usage patterns of a wide variety of bacteria (Kanaya et al. 1998; Wang et al. 2001). We introduced a new feature ...
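To make the idea of mapping high-dimensional frequency vectors onto a two-dimensional grid concrete, here is a minimal sketch of a classic online Kohonen SOM in pure Python. It is not the authors' implementation: the random initialization, the linearly decaying learning rate and neighborhood radius, and the names `train_som` and `best_matching_unit` are all assumptions made for illustration.

```python
import math
import random


def best_matching_unit(weights, x):
    """Grid coordinate of the node whose weight vector is closest to x."""
    return min(
        weights,
        key=lambda node: sum((a - b) ** 2 for a, b in zip(weights[node], x)),
    )


def train_som(data, rows, cols, epochs=50, lr0=0.5, sigma0=None, seed=0):
    """Train a rows x cols SOM on a list of equal-length numeric vectors.

    Assumptions of this sketch: weights start as uniform random values in
    [0, 1), and both the learning rate and the Gaussian neighborhood radius
    decay linearly over the run.
    """
    rng = random.Random(seed)
    dim = len(data[0])
    sigma0 = sigma0 or max(rows, cols) / 2.0
    weights = {
        (r, c): [rng.random() for _ in range(dim)]
        for r in range(rows)
        for c in range(cols)
    }
    n_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        order = list(data)
        rng.shuffle(order)
        for x in order:
            frac = step / n_steps
            lr = lr0 * (1.0 - frac)
            sigma = max(sigma0 * (1.0 - frac), 0.5)
            br, bc = best_matching_unit(weights, x)
            # Pull every node toward x, weighted by its grid distance
            # from the best matching unit.
            for (r, c), w in weights.items():
                d2 = (r - br) ** 2 + (c - bc) ** 2
                h = math.exp(-d2 / (2.0 * sigma ** 2))
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
            step += 1
    return weights


# Toy input: four 4-dimensional "frequency" vectors forming two groups.
toy = [
    [0.9, 0.1, 0.0, 0.0],
    [0.8, 0.2, 0.0, 0.0],
    [0.0, 0.0, 0.1, 0.9],
    [0.0, 0.0, 0.2, 0.8],
]
som = train_som(toy, rows=3, cols=3, epochs=20, seed=1)
placements = [best_matching_unit(som, v) for v in toy]
```

In practice each input vector would be the oligonucleotide frequencies of one genomic window (for example, 256 tetranucleotide frequencies per 10-kb window), and `placements` would give the grid node assigned to each window. Coloring nodes by the species of the windows they attract is one way to visualize the kind of species-specific separation the text describes.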
5-Fluorouracil (5-FU) is a key anticancer drug that, owing to its broad antitumor activity and its synergism with other anticancer drugs, has been used to treat various types of malignancies. In chemotherapeutic regimens, 5-FU has been combined with oxaliplatin, irinotecan and other drugs as a continuous intravenous infusion. Recent clinical chemotherapy studies have shown that several of the regimens with oral 5-FU drugs are not inferior to those involving continuous 5-FU infusion, and it is probable that in some regimens continuous 5-FU infusion can be replaced by oral 5-FU drugs. Historically, both the pharmaceutical industry and academia in Japan have been involved in the development of oral 5-FU drugs. This review focuses on the current knowledge of 5-FU anabolism and catabolism and on the available information about the various orally administrable 5-FU drugs, including UFT, S-1 and capecitabine. Clinical studies comparing the efficacy and adverse events of S-1 and capecitabine have been reported, and the accumulated results should be utilized to optimize the treatment of cancer patients. It is also essential to elucidate the pharmacokinetic mechanism of each of the newly developed drugs, to correctly select the drugs for each patient in the clinical setting, and to further develop optimized drug derivatives.