Summary: CRISPR-based methods for genome editing, epigenome editing, and imaging have provided powerful tools to interrogate the functions of the genome. The design of guide RNAs (gRNAs) is a vital step in CRISPR experiments. We report here the implementation of JACKIE (Jackie and Albert's CRISPR K-mer Instances Enumerator), a pipeline for enumerating all potential single- and multi-copy CRISPR sites in the genome. We demonstrate the application of JACKIE to identify locus-specific repetitive sequences for CRISPR/Casilio-based genomic labeling.

Availability: Source code and CRISPR site databases (JACKIEdb) for hg38 and mm10 are available for download at http://crispr.software/JACKIE

./, and score field recording the copy number. Parallelization is achieved either by starting separate jobs on individual <6merPrefix>.bin files, or by starting batch jobs focusing on 2mer prefixes (e.g., AA AT AC AG operating on AA*.bin, AT*.bin, AC*.bin, AG*.bin, etc.). The outputs (<6merPrefix>.bed) are then merged into a combined BED file using the Unix cat command. Helper scripts are provided for downstream processing of the BED file into a collapsed BED file in which each record encodes all binding sites of the same sequence on the same chromosome, for extracting binding sites within defined regions, or for running the Cas-OFFinder program to identify off-target profiles of selected sequences.

Fig 2. Potential applications of JACKIE pipeline. (a) JACKIE can be used to identify clustered CRISPR binding sites with high copy number or low copy number for CRISPR-based genomic imaging experiments. (b) To demonstrate the genomic imaging use case, we filtered sgRNA sequences with >30 copies clustered within specific <10 kb regions and performed Casilio imaging experiments using the designed sgRNAs. 
UCSC Genome Browser screenshots are shown on the left, with the line-bar annotating the CRISPR sites, and fluorescence microscopy images are shown on the right. Microscopy images were derived by merging green (Casilio labeling) and blue (DAPI nuclear stain) channels. Arrows point to fluorescent loci of interest. Scale bars indicate 5 μm. (c) JACKIE can be used to identify clustered CRISPR binding sites for synergistic activation or repression of target cis-regulatory elements (e.g., promoters) using CRISPR-based epigenetic editing approaches (e.g., CRISPRa or CRISPRi). (d) JACKIE can identify unique CRISPR binding sites in the genome for precise indel induction, dual CRISPR sites for targeted deletion, or clustered sites spanning a genomic region to induce complex rearrangement events.
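
The filtering step described in panel (b), selecting sequences with >30 copies clustered within a <10 kb region, can be illustrated with a minimal Python sketch. This is not the JACKIE helper scripts themselves. It assumes a tab-separated BED layout in which column 4 (name) holds the site sequence, and the function names parse_bed and clustered_sites are hypothetical, introduced here only for illustration.

```python
from collections import defaultdict


def parse_bed(lines):
    """Yield (chrom, start, end, name) from tab-separated BED-like lines.

    Assumption: column 4 (name) holds the CRISPR site sequence.
    """
    for line in lines:
        line = line.strip()
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        yield fields[0], int(fields[1]), int(fields[2]), fields[3]


def clustered_sites(records, copies_over=30, max_span=10_000):
    """Find sequences with more than `copies_over` sites inside a window
    shorter than `max_span` bp on a single chromosome.

    Returns {(name, chrom): (window_start, window_end, n_sites)} giving the
    densest qualifying window for each sequence/chromosome pair.
    """
    by_key = defaultdict(list)
    for chrom, start, end, name in records:
        by_key[(name, chrom)].append((start, end))

    hits = {}
    for key, sites in by_key.items():
        sites.sort()  # sites of one sequence share a length, so ends sort too
        best = None
        left = 0
        for right in range(len(sites)):
            # Advance the left edge until the window is shorter than max_span.
            while left < right and sites[right][1] - sites[left][0] >= max_span:
                left += 1
            n = right - left + 1
            if n > copies_over and (best is None or n > best[2]):
                best = (sites[left][0], sites[right][1], n)
        if best is not None:
            hits[key] = best
    return hits
```

Reading a merged BED file with `clustered_sites(parse_bed(open(path)))` would return one candidate window per qualifying sequence and chromosome. The thresholds are exposed as parameters so that the same sliding-window pass can be reused for other cluster definitions.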