2015
DOI: 10.1137/140962486

String Reconstruction from Substring Compositions

Abstract: Motivated by mass-spectrometry protein sequencing, we consider a simply-stated problem of reconstructing a string from the multiset of its substring compositions. We show that all strings of length 7, one less than a prime, or one less than twice a prime, can be reconstructed uniquely up to reversal. For all other lengths we show that reconstruction is not always possible and provide sometimes-tight bounds on the largest number of strings with given substring compositions. The lower bounds are derived by combi…
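
The length condition stated in the abstract can be checked mechanically. The following is a minimal sketch, not code from the paper; the function names `is_prime` and `has_unique_reconstruction_length` are hypothetical and chosen only for illustration.

```python
def is_prime(m: int) -> bool:
    """Trial-division primality test; adequate for small string lengths."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True


def has_unique_reconstruction_length(n: int) -> bool:
    """True if n is 7, one less than a prime, or one less than twice a prime.

    Per the abstract, these are the lengths for which every string can be
    reconstructed uniquely up to reversal.
    """
    if n == 7 or is_prime(n + 1):
        return True
    # n + 1 == 2p for a prime p
    return (n + 1) % 2 == 0 and is_prime((n + 1) // 2)


# Lengths from 1 to 12 that do NOT satisfy the condition.
uncovered = [n for n in range(1, 13) if not has_unique_reconstruction_length(n)]
```

Under this reading of the condition, the lengths up to 12 that are not covered are 8 and 11.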

Cited by 56 publications (90 citation statements)
References: 16 publications

“…We answer both questions affirmatively by describing a coding scheme that allows for unique reconstruction and correction of a single deletion-insertion mass error. Encoding is performed by interleaving symmetric strings with Catalan-type paths, while decoding is accomplished through a modification of the backtracking decoding algorithm described in [7]. Our work extends the existing literature in coded string reconstruction [9], [10].…”
Section: Introduction
Citation type: mentioning
confidence: 97%
“…In an earlier line of work, the authors of [7] introduced the problem of binary string reconstruction from its substring composition multiset to address the issue of MS/MS readout analysis. The substring composition multiset of a binary string is obtained by writing out all substrings of the string of all possible lengths and then representing each substring by its composition.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
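
The definition quoted above can be illustrated with a short sketch. This is not code from the cited works; it assumes a binary string over the characters '0' and '1', represents a composition as the pair (number of zeros, number of ones), and uses the hypothetical function name `composition_multiset`.

```python
from collections import Counter


def composition_multiset(s: str) -> Counter:
    """Multiset of compositions of all contiguous substrings of a binary string.

    Each composition is the pair (zeros, ones), so the order of symbols
    inside a substring is discarded.
    """
    multiset = Counter()
    n = len(s)
    for start in range(n):
        zeros = ones = 0
        # Extend the substring one symbol at a time and update its counts.
        for end in range(start, n):
            if s[end] == "0":
                zeros += 1
            else:
                ones += 1
            multiset[(zeros, ones)] += 1
    return multiset


forward = composition_multiset("110")
backward = composition_multiset("011")
```

A string and its reversal always yield the same multiset, which is why unique reconstruction is only meaningful up to reversal. A string of length n contributes n(n+1)/2 compositions in total.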