“…The substring composition multiset $C(s)$ of a string $s$ is the multiset of compositions of all possible substrings of the string $s$. As an illustration, the set of all substrings of 001 is $\{0, 0, 1, 00, 01, 001\}$, and the substring composition multiset of 001 equals $\{0^1, 0^1, 1^1, 0^2, 0^1 1^1, 0^2 1^1\}$. Two modeling assumptions are used in [7] and subsequent works [8]-[10]: a) Using MS/MS measurements, one can uniquely infer the composition of a polymer substring from its mass; and b) When a polymer is broken down for mass spectrometry analysis, the masses of all its substrings are observed with identical frequency.…”
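
The definition quoted above can be illustrated with a minimal Python sketch. The function names `composition` and `substring_composition_multiset` are hypothetical and chosen only for this illustration; they do not come from the cited works. The sketch assumes that "substring" means a contiguous substring and represents each composition as a sorted tuple of (symbol, count) pairs.

```python
from collections import Counter


def composition(sub: str) -> tuple:
    """Return the composition of a substring as sorted (symbol, count) pairs.

    Example: "01" -> (("0", 1), ("1", 1)), i.e. 0^1 1^1.
    """
    return tuple(sorted(Counter(sub).items()))


def substring_composition_multiset(s: str) -> Counter:
    """Return the multiset of compositions of all contiguous substrings of s.

    The Counter maps each composition to the number of substrings having it.
    """
    multiset = Counter()
    for i in range(len(s)):
        for j in range(i + 1, len(s) + 1):
            multiset[composition(s[i:j])] += 1
    return multiset


c = substring_composition_multiset("001")
for comp, count in sorted(c.items()):
    print(comp, count)
```

For the string 001 this produces six compositions in total, with $0^1$ counted twice, which matches the multiset given in the excerpt.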