Motivation: Unlike RNA-seq analysis, for which several standard methods are available, there is no standard method for identifying differentially methylated cytosines (DMCs). To identify DMCs, we tested principal component analysis- and tensor decomposition-based unsupervised feature extraction with optimized standard deviation, whose effectiveness for identifying differentially expressed genes (DEGs) was recently recognized.
Results: The proposed methods can outperform some conventional methods, including those that must assume a beta-binomial distribution for methylation (an assumption the proposed methods do not require), especially when applied to methylation profiles measured using high-throughput sequencing. DMCs identified by the proposed methods also overlap significantly with various functional sites, including known differentially methylated regions, enhancers, and DNase I hypersensitive sites. This suggests that the proposed methods are promising candidates for a standard method for identifying DMCs.
Availability: Sample R source code is available at https://github.com/tagtag/PCAUFEOPSD