Chromosomes consist of domains with different transcriptional activities, separated by chromatin boundary sequences such as insulators. Recent studies have suggested that binding sequences of CTCF or other chromatin loop-forming proteins represent typical insulators. Alternatively, some long nucleosome-excluding DNA sequences have been reported to exhibit insulator activity in yeast and sea urchin chromosomes, although specific binding of loop-forming proteins is not expected for them. However, the mechanism of the insulator activity of these sequences, and whether similar insulators exist in other organisms, has remained unclear. In this study, we first constructed and simulated a coarse-grained chromatin model containing nucleosome-rich and nucleosome-excluding DNA regions. We found that a long nucleosome-excluding region between two nucleosome-rich regions could markedly hinder the association of the two neighboring chromatin regions, owing to the greater long-term-averaged rigidity of the nucleosome-excluding region compared with that of the nucleosome-rich regions. Subsequent analysis of genome-wide nucleosome positioning, protein binding, and DNA rigidity in human cells revealed that some nucleosome-excluding rigid DNA sequences without bound chromatin looping proteins could exhibit insulator activity, functioning as chromatin boundaries in various regions of human chromosomes.
the four nucleotide sequences, 84 we defined the rigidity index of each sequence by averaging the rigidities of four sequential nucleotide series. The minimum, central, and maximum values of the rigidities in this data set were 1.9, 11.9, and 27.2, respectively. As the rigidity index of the ArsInsC sequence was estimated as 14.7, we focused on sequences with rigidity indices > 15 as NENLIS candidate sequences (NENLISc). Through these procedures, we collected the NENLISc in GM12878 cells.
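The rigidity-index filter described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy rigidity table, and the use of overlapping (sliding) four-nucleotide windows are assumptions made here for illustration. The real tetranucleotide rigidity values come from the cited data set and are not reproduced. The sketch covers only the rigidity criterion (index > 15), not the other selection steps.

```python
from statistics import mean

# Hypothetical tetranucleotide rigidity values, for illustration only.
# The actual values come from the data set cited in the text.
TOY_RIGIDITY = {"AAAA": 20.0, "AAAT": 10.0, "AATA": 6.0, "ATAA": 12.0}


def rigidity_index(seq, table):
    """Average the rigidities of all four-nucleotide series in `seq`.

    Assumption: the series are taken as overlapping windows shifted by
    one nucleotide; the text does not state the stride explicitly.
    """
    seq = seq.upper()
    if len(seq) < 4:
        raise ValueError("sequence must contain at least four nucleotides")
    windows = [seq[i:i + 4] for i in range(len(seq) - 3)]
    return mean(table[w] for w in windows)


def passes_rigidity_filter(seq, table, threshold=15.0):
    """Return True when the rigidity index exceeds the threshold (> 15 in the text)."""
    return rigidity_index(seq, table) > threshold


if __name__ == "__main__":
    for s in ("AAAAA", "AAAAT", "AAATAA"):
        print(s, rigidity_index(s, TOY_RIGIDITY), passes_rigidity_filter(s, TOY_RIGIDITY))
```

With a strict inequality, a sequence whose index equals the threshold exactly is not selected, which matches the "> 15" wording in the text.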
Evaluation of the Insulator Activities of NENLISc in Human Cells. To evaluate the insulator activities (the ability to function as a chromatin domain boundary) of NENLISc, we measured the contact probabilities among loci in the 10 kbp up- and downstream chromatin regions of each NENLISc using Hi-C data of GM12878 cells at 1 kbp resolution (GEO: GSE63525). We defined $P_{s,t}$ as the contact probability between the s-th and t-th loci around a NENLISc. Here, s and t are the positions relative to the focused NENLISc, given as s, t = −10, −9, …, −2, −1, 1, 2, …, 10 (kbp). Suppose that the Hi-C data provide the contact probability between two loci whose positions relative to the central locus of the focused NENLISc are s' and t', respectively. Then, $P_{s,t}$ is given by this contact probability when s' and t' have the same signs as s and t, respectively, and satisfy |s| − 1 kbp < |s'| < |s| and |t| − 1 kbp < |t'| < |t|. We also measured the contact probabilities between the s-th and t-th loci around 11,000 (5000 from each of 22