Microarray technology provides a powerful tool for the expression profile of thousands of genes simultaneously, which makes it possible to explore the molecular and metabolic etiology of the development of a complex disease under study. However, classical statistical methods and technologies fail to be applicable to microarray data. Therefore, it is necessary and motivating to develop powerful methods for large-scale statistical analyses. In this paper, we described a novel method, called Ranking Analysis of Microarray Data (RAM). RAM, which is a large-scale two-sample t-test method, is based on comparisons between a set of ranked T statistics and a set of ranked Z values (a set of ranked estimated null scores) yielded by a "randomly splitting" approach instead of a "permutation" approach and a two-simulation strategy for estimating the proportion of genes identified by chance, i.e., the false discovery rate (FDR). The results obtained from the simulated and observed microarray data show that RAM is more efficient in identification of genes differentially expressed and estimation of FDR under undesirable conditions such as a large fudge factor, small sample size, or mixture distribution of noises than Significance Analysis of Microarrays.
The goal of linkage mapping is to find the true order of loci from a chromosome. Since the number of possible orders is large even for a modest number of loci, the problem of finding the optimal solution is known as a NP-hard problem or traveling salesman problem (TSP). Although a number of algorithms are available, many either are low in the accuracy of recovering the true order of loci or require tremendous amounts of computational resources, thus making them difficult to use for reconstructing a large-scale map. We developed in this article a novel method called unidirectional growth (UG) to help solve this problem. The UG algorithm sequentially constructs the linkage map on the basis of novel results about additive distance. It not only is fast but also has a very high accuracy in recovering the true order of loci according to our simulation studies. Since the UG method requires n À 1 cycles to estimate the ordering of n loci, it is particularly useful for estimating linkage maps consisting of hundreds or even thousands of linked codominant loci on a chromosome.A LTHOUGH more and more genomes are being sequenced, high-quality linkage maps for many organisms remain useful. For a majority of organisms, obtaining appropriate linkage maps will be a necessary step for understanding their genetic architecture. With the advance of technology as well as increasing demand of high-density linkage maps, the number of loci simultaneously examined in experiments is steadily growing, which presents a considerable computational challenge for estimating the underlying linkage map since the number of possible orders of loci increases rapidly with the number of loci (Olson and Boehnke 1990;Mester et al. 2003). The construction of linkage maps has been recognized as a special case of the traveling salesman problem (TSP) (Liu 1998). The TSP is a classical non-deterministic polynomial time (NP)-complete problem (Wilson 1988;Olson and Boehnke 1990;Falk 1992) that has attracted the attention of mathematicians and computer scientists. Currently, there are two approaches to tackle the problem. One is to find the answer by performing exhaustive searches and the other is to find approximations. Even for just 10 loci on a chromosome, there are 1,814,400 possible orders. Thus it is extremely time consuming (if not entirely impossible) to exhaustively search all possible orders when the number of loci on a chromosome is .30 (Mester et al.
Although most high-density linkage maps have been constructed from codominant markers such as single-nucleotide polymorphisms (SNPs) and microsatellites due to their high linkage information, dominant markers can be expected to be even more significant as proteomic technique becomes widely applicable to generate protein polymorphism data from large samples. However, for dominant markers, two possible linkage phases between a pair of markers complicate the estimation of recombination fractions between markers and consequently the construction of linkage maps. The low linkage information of the repulsion phase and high linkage information of coupling phase have led geneticists to construct two separate but related linkage maps. To circumvent this problem, we proposed a new method for estimating the recombination fraction between markers, which greatly improves the accuracy of estimation through distinction between the coupling phase and the repulsion phase of the linked loci. The results obtained from both real and simulated F 2 dominant marker data indicate that the recombination fractions estimated by the new method contain a large amount of linkage information for constructing a complete linkage map. In addition, the new method is also applicable to data with mixed types of markers (dominant and codominant) with unknown linkage phase.
Our program is implemented in Matlab and is available upon request.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.