It has been nearly 40 y since it was suggested that genomic methylation patterns could be transmitted via maintenance methylation during S phase and might play a role in the dynamic regulation of gene expression during development [Holliday R, Pugh JE (1975) Science 187(4173):226-232; Riggs AD (1975) Cytogenet Cell Genet 14(1):9-25]. This revolutionary proposal was justified by "... our almost complete ignorance of the mechanism for the unfolding of the genetic program during development" that prevailed at the time. Many correlations between transcriptional activation and demethylation have since been reported, but causation has not been demonstrated and to date there is no reasonable proof of the existence of a complex biochemical system that activates and represses genes via reversible DNA methylation. Such a system would supplement or replace the conserved web of transcription factors that regulate cellular differentiation in organisms that have unmethylated genomes (such as Caenorhabditis elegans and the Dipteran insects) and those that methylate their genomes. DNA methylation does have essential roles in irreversible promoter silencing, as in the monoallelic expression of imprinted genes, in the silencing of transposons, and in X chromosome inactivation in female mammals. Rather than reinforcing or replacing regulatory pathways that are conserved between organisms that have either methylated or unmethylated genomes, DNA methylation endows genomes with the ability to subject specific sequences to irreversible transcriptional silencing even in the presence of all of the factors required for their expression, an ability that is generally unavailable to organisms that have unmethylated genomes.
Structure of Genomic Methylation Patterns
The addition of a fifth base (5-methylcytosine or m⁵C) increases the maximum potential information content of DNA from 2 bits per base pair to 2.32 bits; the addition of naturally occurring oxidized forms of m⁵C (5-hydroxymethylcytosine, 5-formylcytosine, and 5-carboxylcytosine) increases the information content still further, although m⁵C is much more abundant than the oxidized derivatives. The assembled and annotated fraction of the human genome contains ∼29 million CpG dinucleotides, each of which can exist in the methylated or unmethylated state.
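The information-content figures above follow from taking the base-2 logarithm of the number of distinguishable bases. The short sketch below reproduces that arithmetic; it assumes all symbols are equally likely, which gives an upper bound rather than a measured value, and the function name and base labels are illustrative choices that do not come from the text.

```python
import math


def bits_per_position(n_symbols: int) -> float:
    """Maximum information (in bits) carried by one position that can
    hold any of n_symbols equally likely symbols: log2(n_symbols)."""
    if n_symbols < 1:
        raise ValueError("n_symbols must be a positive integer")
    return math.log2(n_symbols)


# Illustrative alphabets (labels are assumptions made for this sketch).
canonical = ["A", "C", "G", "T"]
with_m5c = canonical + ["m5C"]
with_oxidized = with_m5c + ["hm5C", "f5C", "ca5C"]  # oxidized forms of m5C

for label, alphabet in [
    ("four canonical bases", canonical),
    ("plus m5C", with_m5c),
    ("plus oxidized derivatives", with_oxidized),
]:
    print(f"{label}: {bits_per_position(len(alphabet)):.2f} bits")

# Each CpG site is treated here as one binary (methylated/unmethylated)
# variable, so N sites can encode at most N bits if sites were independent.
cpg_sites = 29_000_000  # approximate count quoted in the text
print(f"upper bound for CpG methylation states: {cpg_sites:,} bits")
```

Under these assumptions the three alphabets give 2.00, 2.32, and 3.00 bits per position. Because m⁵C is far more abundant than its oxidized derivatives, the symbols are not equally likely in practice and the realized information content is lower than these maxima.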