Digital twins in the mechanics of materials usually involve multimodal data, in the sense that an instance of a mechanical component has both experimental and simulated data. The simulations aim not only to replicate experimental observations but also to extend the data. Augmentation, whether spatial, temporal, or functional, is needed for the various possible uses of the components in order to improve the predictions of mechanical behavior. Such multimodal data are scarce and high-dimensional, and a physics-based causal relation exists between observational and simulated data. We propose a data augmentation scheme coupled with data pruning in order to limit the memory requirements of high-dimensional augmented data. This augmentation is desirable for digital twinning assisted by artificial intelligence when performing nonlinear model reduction. Here, data augmentation aims at preserving similarities in terms of the validity domain of reduced digital twins. In this article, we consider a specimen subjected to a mechanical test at high temperature, where the as-manufactured geometry may impact the lifetime of the component. Hence, an instance is represented by a digital twin that includes 3D X-ray tomography data of the specimen, the related finite element mesh, and the finite element predictions of thermo-mechanical variables at several time steps. Each specimen thus comes with both geometrical and mechanical information. Multimodal data, which couple different representation modalities, are hard to collect, and annotating them requires significant effort. Thus, the analysis of multimodal data generally suffers from data scarcity. The proposed data augmentation scheme aims at training a recommending system that recognizes a category of data available in a training set that has already been fully analyzed using high-fidelity models.
Such a recommending system enables the use of a ROM-net for fast lifetime assessment via local reduced-order models.