Unlabeled shape analysis is a rapidly emerging and challenging area of
statistics, driven by novel applications in
bioinformatics. We consider the situation where two configurations are
matched under various constraints: each configuration has a subset of
manually located "markers" that have a high probability of matching each other, while a
larger subset consists of unlabeled points. We consider a plausible model and
give an implementation using the EM algorithm. The work is motivated by a real
gel experiment on renal cancer, and our approach allows for the possibility
of missing and misallocated markers. The methodology is successfully used to
automatically locate and remove a grossly misallocated marker within the given
data set.

Comment: Published at http://dx.doi.org/10.1214/12-AOAS544 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org)