The m-adic residue codes are a generalization of the quadratic residue codes. They are cyclic codes which exist at prime lengths p over GF(q) when m | (p − 1), (q, p) = 1, and q is an m-adic residue modulo p. The m-adic residue codes are investigated and are found to have many of the strong properties of the quadratic residue codes. A subgroup of the automorphism group and restrictions on the form of the idempotents of the m-adic residue codes are given. It is shown that some m-adic residue codes are self-orthogonal and that the duals of some m-adic residue codes are their complements. Bounds on the minimum of the weights of the odd-like vectors in the odd-like codes are given. (For binary codes, this is the minimum odd weight.) At some lengths, m-adic residue codes exist for several values of m. Containment relationships between these codes are demonstrated which show that when m is even, m-adic residue codes inherit properties of quadratic residue codes. A table is included that contains minimum weights of the binary m-adic residue codes of lengths less than or equal to 127. Many of these codes have the highest minimum weight known for codes of their lengths and dimensions.
Index Terms: Cyclic codes, quadratic residue codes, m-adic residue codes.
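The existence conditions stated above can be checked numerically. The following is a minimal sketch, not taken from the paper: the function names are hypothetical, p is assumed to be prime, and q is assumed to be a valid field size. It relies on the standard criterion that, for prime p with m | (p − 1), q is an m-th power residue modulo p exactly when q^((p − 1)/m) ≡ 1 (mod p).

```python
from math import gcd


def is_m_adic_residue(q: int, m: int, p: int) -> bool:
    """Return True if q is an m-th power residue modulo the prime p.

    Assumes p is prime and m divides p - 1 (the caller checks this).
    """
    return pow(q, (p - 1) // m, p) == 1


def admissible_m(p: int, q: int) -> list[int]:
    """List every m >= 2 satisfying m | (p - 1), gcd(q, p) = 1,
    and q an m-adic residue modulo p (hypothetical helper)."""
    if gcd(q, p) != 1:
        return []
    return [
        m
        for m in range(2, p)
        if (p - 1) % m == 0 and is_m_adic_residue(q, m, p)
    ]


# Binary codes (q = 2) of length 31: several values of m qualify.
print(admissible_m(31, 2))  # [2, 3, 6]
# Binary codes of length 7: only the quadratic residue case m = 2.
print(admissible_m(7, 2))   # [2]
```

At length 31 the sketch reports m = 2, 3, and 6 for binary codes, which illustrates the remark that several values of m can coexist at one length.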
Correlation metrics are widely utilized in genomics analysis and often implemented with little regard to assumptions of normality, homoscedasticity, and independence of values. This is especially true when comparing values between replicated sequencing experiments that probe chromatin accessibility, such as assays for transposase-accessible chromatin via sequencing (ATAC-seq). Such data can possess several regions across the human genome with little to no sequencing depth and are thus non-normal with a large portion of zero values. Despite distributed use in the epigenomics field, few studies have evaluated and benchmarked how correlation and association statistics behave across ATAC-seq experiments with known differences or the effects of removing specific outliers from the data. Here, we developed a computational simulation of ATAC-seq data to elucidate the behavior of correlation statistics and to compare their accuracy under set conditions of reproducibility. Using these simulations, we monitored the behavior of several correlation statistics, including the Pearson's R and Spearman's ρ coefficients as well as Kendall's τ and Top-Down correlation. We also test the behavior of association measures, including the coefficient of determination R², Kendall's W, and normalized mutual information. Our experiments reveal an insensitivity of most statistics, including Spearman's ρ, Kendall's τ, and Kendall's W, to increasing differences between simulated ATAC-seq replicates. The removal of co-zeros (regions lacking mapped sequenced reads) between simulated experiments greatly improves the estimates of correlation and association. After removing co-zeros, the R² coefficient and normalized mutual information display the best performance, having a closer one-to-one relationship with the known portion of shared, enhanced loci between simulated replicates. When comparing values between experimental ATAC-seq data using a random forest model, mutual information best predicts ATAC-seq replicate relationships. Collectively, this study demonstrates how measures of correlation and association can behave in epigenomics experiments and provides improved strategies for quantifying relationships in these increasingly prevalent and important chromatin accessibility assays.
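The co-zero removal step described above can be illustrated with a small sketch. This is not the authors' simulation or code: the function names are hypothetical, the read counts are invented toy values, and only Pearson's R is computed, using the standard library.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def remove_co_zeros(x, y):
    """Drop regions where BOTH replicates have zero mapped reads."""
    kept = [(a, b) for a, b in zip(x, y) if not (a == 0 and b == 0)]
    return [a for a, _ in kept], [b for _, b in kept]


# Invented toy counts per genomic region (assumption, not real data).
rep_a = [0, 0, 0, 0, 5, 3, 8, 1]
rep_b = [0, 0, 0, 0, 2, 7, 1, 6]

r_all = pearson_r(rep_a, rep_b)
filtered_a, filtered_b = remove_co_zeros(rep_a, rep_b)
r_filtered = pearson_r(filtered_a, filtered_b)

print(f"with co-zeros:    R = {r_all:.3f}")
print(f"without co-zeros: R = {r_filtered:.3f}")
```

In this toy case the shared zeros pull the coefficient upward, so two replicates that disagree in every covered region still appear weakly positively correlated until the co-zeros are removed.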