We study statistical inference for small-noise-perturbed multiscale dynamical systems under the assumption that we observe a single time series from the slow process only. We construct estimators for both averaging and homogenization regimes, based on an appropriate misspecified model motivated by a second-order stochastic Taylor expansion of the slow process with respect to a function of the time-scale separation parameter. In the case of a fixed number of observations, we establish consistency, asymptotic normality, and asymptotic statistical efficiency of a minimum contrast estimator (MCE), and we identify the limiting variance explicitly; we furthermore establish consistency and asymptotic normality of a simplified minimum contrast estimator (SMCE), which, however, is not efficient in general. These results are then extended to the case of high-frequency observations under a condition restricting the rate at which the number of observations may grow vis-à-vis the separation of scales. Numerical simulations illustrate the theoretical results.