In the field of empirical asset pricing, the challenges of high dimensionality, non-linear relationships, and interaction effects have led to the increasing popularity of machine learning (ML) methods. This study investigates the performance of ML methods when predicting different measures of stock returns from various factor models and investigates the feature importance and interaction effects among firm-specific variables and macroeconomic factors in this context. Our findings reveal that neural network models exhibit consistent performance across different stock return measures when they rely solely on firm-specific characteristic variables. However, the inclusion of macroeconomic factors from the financial market, real economic activities, and investor sentiment leads to substantial improvements in the model performance. Notably, the degree of improvement varies with the specific measures of stock returns under consideration. Furthermore, our analysis indicates that, after the inclusion of macroeconomic factors, there is a dissimilarity in model performance, variable importance, and interaction effects among macroeconomic and firm-specific variables, particularly concerning abnormal returns derived from the Fama–French three- and five-factor models compared with excess returns. This divergence is primarily attributed to the extent to which these factor models remove the variance associated with the macroeconomic variables. These findings collectively offer valuable insights into the efficacy of neural network models for stock return predictions and contribute to a deeper understanding of the intricate relationship between factor models, stock returns, and macroeconomic conditions in the domain of empirical asset pricing.