Large-scale data obtained from the aggregation of already collected multi-site neuroimaging datasets has brought benefits such as higher statistical power, reliability, and robustness to the studies. Despite these promises from growth in sample size, substantial technical variability stemming from differences in scanner specifications exists in the aggregated data and could inadvertently bias any downstream analyses on it. Such a challenge calls for data normalization and/or harmonization frameworks, in addition to comprehensive criteria to estimate the scanner-related variability and evaluate the harmonization frameworks. In this study, we propose MISPEL (Multi-scanner Image harmonization via Structure Preserving Embedding Learning), a supervised multi-scanner harmonization method. Unlike existing techniques, MISPEL does not assume a perfect coregistration across the scans, and the framework is naturally extendable to more than two scanners. We also designed a set of comprehensive criteria to investigate the scanner-related technical variability and evaluate the harmonization techniques. As an essential requirement of our criteria, we introduced a multi-scanner matched dataset of four 3T MRI T1 images, which, to the best of our knowledge is the first dataset of its kind. We also investigated our evaluations using two popular segmentation frameworks: FSL and segmentation in statistical parametric mapping (SPM). Lastly, we compared MISPEL to popular methods of normalization and harmonization, namely White Stripe, RAVEL, and CALAMITI. MISPEL outperformed these methods and is promising for many other neuroimaging modalities.