One important and sometimes contentious challenge in paleobiology is discriminating between species, which is increasingly accomplished by comparing specimen shape. While lengths and proportions are needed to achieve this task, finer geometric information, such as concavity, convexity, and curvature, plays a crucial role in the undertaking. Nonetheless, standard morphometric methodologies such as landmark analysis cannot quantitatively capture these features and other important fine-scale geometric notions. Here we develop and implement state-of-the-art techniques from the emerging field of computational geometry to tackle this problem with the Mississippian blastoid Pentremites. We adapt a previously known computational framework to produce a measure of dissimilarity between shapes. More precisely, we compute “distances” between pairs of 3D surface scans of specimens by comparing a mix of global and fine-scale geometric measurements. This process treats the 3D scan of a specimen as a single piece of data incorporating complete geometric information about the shape; as a result, the scans used must accurately reflect the geometry of whole, undamaged, undeformed specimens. Using this information, we are able to group the specimens into clusters and ultimately reproduce and refine results obtained in previous work on species discrimination. Our methodology is landmark free, and therefore faster and less prone to human error than previous landmark-based methodologies.
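The step from pairwise "distances" to clusters can be illustrated with a minimal sketch. It assumes the dissimilarities have already been computed and stored in a symmetric matrix. The abstract does not name a clustering algorithm, so single-linkage grouping at a fixed threshold is used here purely as an illustration. The function name `single_linkage_clusters`, the threshold, and the toy values are all hypothetical.

```python
def single_linkage_clusters(dist, threshold):
    """Group items linked by a chain of pairwise dissimilarities <= threshold.

    `dist` is a symmetric matrix (list of lists) of precomputed
    dissimilarities. Illustrative stand-in only: the source does not
    state which clustering method was used.
    """
    n = len(dist)
    parent = list(range(n))  # union-find forest, one tree per cluster

    def find(i):
        # Follow parent links to the root, compressing the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if dist[i][j] <= threshold:
                parent[find(i)] = find(j)  # merge the two clusters

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Toy dissimilarities for four hypothetical specimens (made-up numbers).
toy = [
    [0.00, 0.10, 0.90, 0.80],
    [0.10, 0.00, 0.85, 0.90],
    [0.90, 0.85, 0.00, 0.20],
    [0.80, 0.90, 0.20, 0.00],
]
clusters = single_linkage_clusters(toy, threshold=0.3)
print(clusters)
```

With these toy values, specimens 0 and 1 fall into one group and specimens 2 and 3 into another. Raising the threshold merges groups, and lowering it splits them.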
We introduce a nonparametric way to estimate the global probability density function for a random persistence diagram. Specifically, we construct a kernel density function centered at a given persistence diagram with a given bandwidth. Our approach encapsulates the number of topological features and considers the appearance or disappearance of features near the diagonal in a stable fashion. In particular, the structure of our kernel individually tracks long persistence features, while considering features near the diagonal as a collective unit. The choice to describe short persistence features as a group reduces computation time while retaining accuracy. Indeed, we prove that the associated kernel density estimate converges to the true distribution as the number of persistence diagrams increases and the bandwidth shrinks accordingly. We also establish the convergence of the mean absolute deviation estimate, defined according to the bottleneck metric. Lastly, examples of kernel density estimation are presented for typical underlying datasets.
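The distinction between long persistence features and features near the diagonal can be made concrete with a small sketch. It only separates a diagram into the two groups by a persistence cutoff and does not reproduce the kernel itself. The function name `split_by_persistence`, the cutoff value, and the toy diagram are assumptions made for illustration.

```python
def split_by_persistence(diagram, threshold):
    """Separate (birth, death) pairs into long- and short-persistence features.

    Persistence is death - birth, i.e. proportional to the distance from
    the diagonal. The threshold is a hypothetical cutoff, not one taken
    from the source.
    """
    long_features = []
    short_features = []
    for birth, death in diagram:
        if death - birth >= threshold:
            long_features.append((birth, death))   # tracked individually
        else:
            short_features.append((birth, death))  # treated as a group
    return long_features, short_features


# Toy persistence diagram (made-up values).
toy_diagram = [(0.0, 1.0), (0.2, 0.25), (0.5, 0.55), (0.1, 0.9)]
long_f, short_f = split_by_persistence(toy_diagram, threshold=0.3)
print(long_f, len(short_f))
```

In this sketch the short features are summarized only by their count, which is one simple way to treat them as a collective unit.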
We develop here an algorithmic framework for constructing consistent multiscale Laplacian eigenfunctions (vectors) on data. Consequently, we address the unsupervised machine learning task of finding scalar functions capturing consistent structure across scales in data, in a way that encodes intrinsic geometric and topological features. This is accomplished by two algorithms for eigenvector cascading. We show via examples that cascading accelerates the computation of graph Laplacian eigenvectors, and more importantly, that one obtains consistent bases of the associated eigenspaces across scales. Finally, we present an application to TDA mapper, showing that our multiscale Laplacian eigenvectors identify stable flare-like structures in mapper graphs of varying granularity.
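For readers unfamiliar with graph Laplacian eigenvectors, the following is a minimal single-scale sketch. It does not implement the cascading algorithms described above. It builds the unnormalized Laplacian of a small graph and approximates the eigenvector of the smallest nonzero eigenvalue (the Fiedler vector) by shifted power iteration. The function names, the iteration count, and the path-graph example are assumptions chosen for illustration.

```python
import math


def laplacian(n, edges):
    """Unnormalized graph Laplacian L = D - A as a dense list of lists."""
    L = [[0.0] * n for _ in range(n)]
    for i, j in edges:
        L[i][i] += 1.0
        L[j][j] += 1.0
        L[i][j] -= 1.0
        L[j][i] -= 1.0
    return L


def fiedler_pair(L, iterations=500):
    """Approximate the smallest nonzero eigenvalue of L and its eigenvector.

    Power iteration on (shift * I - L), projecting out the constant vector
    at every step. Assumes a connected graph and a start vector that is not
    orthogonal to the Fiedler vector.
    """
    n = len(L)
    # Gershgorin bound: every eigenvalue of L is at most 2 * max degree.
    shift = 2.0 * max(L[i][i] for i in range(n))
    v = [float(i) for i in range(n)]
    for _ in range(iterations):
        mean = sum(v) / n
        v = [x - mean for x in v]  # remove the constant (eigenvalue 0) part
        w = [shift * v[i] - sum(L[i][j] * v[j] for j in range(n))
             for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    Lv = [sum(L[i][j] * v[j] for j in range(n)) for i in range(n)]
    eigenvalue = sum(v[i] * Lv[i] for i in range(n))  # Rayleigh quotient
    return eigenvalue, v


# Path graph on four vertices: 0 - 1 - 2 - 3.
L = laplacian(4, [(0, 1), (1, 2), (2, 3)])
value, vector = fiedler_pair(L)
print(value, vector)
```

On the path graph the resulting vector changes sign once along the path, which is the kind of scalar function that separates a graph into two sides. Dense power iteration is used only to keep the sketch dependency free; practical computations would normally rely on a sparse eigensolver.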