Geomagnetically Induced Currents (GICs) are a severe space weather hazard, driven by coupling between the solar wind and the magnetosphere. GICs are rarely measured directly; instead, the variability of the ground magnetic field is often used as a proxy. Recently, space weather models have been developed to forecast whether the magnetic field variability (R) will exceed specific, extreme thresholds. We test an example machine learning-based model developed for the northern United Kingdom. We evaluate its performance (discriminative skill and calibration) as a function of magnetospheric state, solar wind input, and magnetic local time. We find that the model's performance is highest during active conditions, for example, geomagnetic storms, and lowest during isolated substorms and “quiet” intervals, even though these conditions dominate the training data set. Correspondingly, performance is high when the solar wind conditions are elevated (i.e., high velocity, large total magnetic field strength, and a southward-oriented interplanetary magnetic field), and lowest when the north-south component of the magnetic field is highly variable or close to zero. With respect to magnetic local time, performance is highest in the dusk and night sectors and lowest on the dayside. The model appears to capture multiple modes of magnetospheric activity, including substorms and viscous interactions, but poorly predicts impulsive phenomena (i.e., storm sudden commencements) and longer-timescale coupling processes. Future models of mid-latitude magnetic field variability will need to make effective use of longer intervals of unpropagated solar wind data (i.e., observations from L1) to describe the magnetospheric conditions and response more completely.
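
The stratified evaluation described above can be illustrated with a minimal sketch. It assumes a probabilistic forecast of threshold exceedance and uses the Brier score as a stand-in for calibration and the ROC area under the curve as a stand-in for discriminative skill; the abstract does not state which metrics were used. The function names, the four-sector split of magnetic local time, and the toy data are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def brier_score(probs, outcomes):
    """Mean squared difference between forecast probability and 0/1 outcome."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


def roc_auc(probs, outcomes):
    """Chance that a random event outranks a random non-event (ties count 0.5)."""
    pos = [p for p, o in zip(probs, outcomes) if o == 1]
    neg = [p for p, o in zip(probs, outcomes) if o == 0]
    if not pos or not neg:
        return None  # undefined when only one class is present
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def mlt_sector(mlt_hours):
    """Hypothetical four-sector split of magnetic local time (hours)."""
    h = mlt_hours % 24
    if 3 <= h < 9:
        return "dawn"
    if 9 <= h < 15:
        return "day"
    if 15 <= h < 21:
        return "dusk"
    return "night"


def scores_by_sector(probs, outcomes, mlts):
    """Group forecasts by MLT sector and score each group separately."""
    groups = defaultdict(lambda: ([], []))
    for p, o, m in zip(probs, outcomes, mlts):
        ps, ys = groups[mlt_sector(m)]
        ps.append(p)
        ys.append(o)
    return {
        sector: {
            "n": len(ps),
            "brier": brier_score(ps, ys),
            "auc": roc_auc(ps, ys),
        }
        for sector, (ps, ys) in groups.items()
    }


# Toy data (invented): forecast probability, observed exceedance, MLT in hours.
probs = [0.9, 0.2, 0.8, 0.1, 0.6, 0.4, 0.3, 0.7]
outcomes = [1, 0, 1, 0, 0, 1, 0, 1]
mlts = [22, 23, 1, 2, 11, 12, 13, 14]

results = scores_by_sector(probs, outcomes, mlts)
for sector, scores in sorted(results.items()):
    print(sector, scores)
```

The same grouping could be keyed on magnetospheric state or on binned solar wind parameters instead of magnetic local time. A lower Brier score and a higher area under the curve both indicate better performance within a group, and groups that contain only events or only non-events have no defined area under the curve.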