“…First, a dummy identity vector unique to each target was generated for use in control experiments; the collection of dummy vectors therefore forms an identity matrix. Second, frequencies of subsequences (“k-mers”) of the protein primary sequence were tabulated for lengths one, two, and three [33]. Multimer frequencies were calculated with an overlapping sliding window (e.g., for the sequence MKSLP in MMP3, the 2-mer vector has length 4 and the 3-mer vector has length 3, comprising the subsequences MKS, KSL, and SLP, each occurring once).…”
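
The overlapping sliding window described in the excerpt can be illustrated with a minimal Python sketch. The function names `kmer_counts` and `kmer_frequencies` are hypothetical and are not taken from the paper, and dividing each count by the total number of windows is an assumption, since the excerpt does not state how frequencies were normalized.

```python
from collections import Counter


def kmer_counts(sequence, k):
    """Count overlapping k-mers by sliding a window of width k one residue at a time."""
    if k < 1:
        raise ValueError("k must be a positive integer")
    # A sequence of length n yields n - k + 1 windows (none if n < k).
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def kmer_frequencies(sequence, k):
    """Relative frequency of each k-mer (assumed normalization: count / total windows)."""
    counts = kmer_counts(sequence, k)
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()} if total else {}


# The five-residue fragment used as the example in the excerpt.
fragment = "MKSLP"
two_mers = kmer_counts(fragment, 2)    # 4 windows: MK, KS, SL, LP
three_mers = kmer_counts(fragment, 3)  # 3 windows: MKS, KSL, SLP
```

For a full-length protein the same k-mer can recur, so its count exceeds one; the five-residue fragment is too short for that to happen.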