Background: Hybrid imaging (e.g., positron emission tomography [PET]/computed tomography [CT], PET/magnetic resonance imaging [MRI]) enables the visualization and quantification of morphological and physiological tumor characteristics in a single study. The noninvasive characterization of tumor heterogeneity is essential for grading, treatment planning, and follow-up of oncological patients. However, conventional (CONV) image-based parameters, such as tumor diameter, tumor volume, and radiotracer activity uptake, are insufficient to describe tumor heterogeneity. Here, radiomics shows promise for a better characterization of tumors. Nevertheless, validating such methods requires imaging objects capable of reflecting heterogeneity across imaging modalities. We propose a phantom to simulate tumor heterogeneity repeatably in PET, CT, and MRI.

Methods: The phantom consists of three 50-ml plastic tubes partially filled with acrylic spheres of the following diameters: S1, 1.6 mm; S2, 50% 1.6 mm and 50% 6.3 mm; S3, 6.3 mm. The spheres were fixed to the bottom of each tube by a plastic grid, yielding one sphere-free homogeneous region and one heterogeneous (S1, S2, or S3) region per tube. A 3-tube phantom and its replica were filled with a fluorodeoxyglucose (18F) solution for test-retest measurements in a Siemens TPTV PET/CT system and a Siemens Biograph mMR PET/MR system. A total of 42 radiomic features (10 first-order and 32 texture features) were calculated for each phantom region and imaging modality. Radiomic feature stability was evaluated through coefficients of variation (COV) across phantoms and scans for PET, CT, and MRI. Further, the Wilcoxon test was used to assess the capability of stable features to discriminate the simulated phantom regions.

Results: The different patterns (S1-S3) showed visible heterogeneity in all imaging modalities. However, a clear visual difference between the patterns was present only in CT and MRI.
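As an illustration of the repeatability criterion described in the Methods (COV across phantoms and scans, with features at or below 10% regarded as repeatable), a minimal Python sketch follows. It rests on assumptions not stated in the source: the COV is taken as the sample standard deviation divided by the absolute mean, and the function names, feature names, and numbers are hypothetical placeholders, not data or code from the study.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Return the COV in percent: sample standard deviation / |mean| * 100.

    Assumption: the sample (n - 1) standard deviation is used; the source
    does not specify which estimator was applied.
    """
    m = mean(values)
    if m == 0:
        raise ValueError("COV is undefined for a zero mean")
    return stdev(values) / abs(m) * 100.0


def repeatable_features(measurements, threshold=10.0):
    """Return the sorted names of features whose COV is <= threshold (percent)."""
    return sorted(
        name
        for name, values in measurements.items()
        if coefficient_of_variation(values) <= threshold
    )


# Made-up values for illustration only (not data from the study):
# one list per feature, one entry per phantom/scan repetition.
example = {
    "feature_a": [10.0, 10.5, 9.5, 10.0],
    "feature_b": [1.0, 2.0, 3.0, 4.0],
}
stable = repeatable_features(example)
```

In this toy input, only the hypothetical `feature_a` varies little enough across repetitions to pass the 10% threshold; in practice the same check would be run per phantom region and per imaging modality.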
Across all phantom regions, 10, 16, and 21 of the 42 evaluated features had a COV of 10% or less in PET, CT, and MR images, respectively. In particular, CONV, histogram, and gray-level run length matrix features showed high repeatability for all phantom regions and imaging modalities. Several of the repeatable texture features allowed image-based discrimination of the different phantom regions (p < 0.05). However, depending