Magnetic materials play a crucial role in the transition to more sustainable forms of energy and to electric vehicles. A shortage of magnetic materials is anticipated, creating an urgent need to discover and design new ones. Computational design of magnetic materials using density functional theory is daunting because the magnetic ground state must be identified from a combinatorially large set of possible configurations. Machine learning offers a path forward by enabling efficient surrogate models that can more readily enumerate these states, but training data are scarce, and what is available tends to be imbalanced, with non-magnetic entries over-represented. In this work we show that the discrete, previously tackled data imbalance at the level of magnetic ordering becomes, when the data are unraveled to the level of atomic magnetic moments, an imbalanced continuous distribution with many zeros, which in turn leads to models with low accuracy for magnetic properties. We mitigate this with a two-part model framework. Our scheme first classifies atoms as magnetic or non-magnetic, with an F1 score and Matthews correlation coefficient (MCC) of $\sim$91\%, and then provides an implicit embedding representation that maps directly onto the magnitude of the magnetic moment, with a mean absolute error (MAE) of 0.1 $\mu_{\text{B}}$. Beyond screening for new magnetic materials, we demonstrate an additional practical use case of our scheme: the provision of good initial guesses for magnetic moments in first-principles electronic relaxations. Such initialization is shown to lead to faster convergence to configurations that lie closer to the ground state.
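As an illustration of the two-part idea and of the metrics quoted above, the following minimal sketch gates a regressed moment magnitude by a per-atom magnetic/non-magnetic decision and then evaluates F1, MCC, and MAE. It is not the model used in this work: the function names, the probability threshold of 0.5, the labeling tolerance of 0.05 $\mu_{\text{B}}$, and the toy numbers are all assumptions chosen for illustration.

```python
import math

def two_part_predict(p_magnetic, magnitude, threshold=0.5):
    """Gate a regressed moment by a classifier probability (hypothetical helper).

    Atoms classified as non-magnetic get a moment of exactly zero;
    the regressed magnitude is used only for atoms classified as magnetic.
    """
    return [m if p >= threshold else 0.0 for p, m in zip(p_magnetic, magnitude)]

def confusion(y_true, y_pred):
    """Return (tp, tn, fp, fn) for two sequences of booleans."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    tn = sum(1 for t, p in zip(y_true, y_pred) if not t and not p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    return tp, tn, fp, fn

def f1_score(y_true, y_pred):
    tp, _, fp, fn = confusion(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def mcc(y_true, y_pred):
    tp, tn, fp, fn = confusion(y_true, y_pred)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

def mae(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

# Toy data (made up): reference moments in Bohr magnetons, mostly zeros.
true_moment = [0.0, 0.0, 0.0, 2.1, 0.0, 3.0]
p_magnetic = [0.1, 0.2, 0.7, 0.9, 0.05, 0.8]   # stage 1: classifier output
magnitude = [0.3, 0.1, 0.5, 2.0, 0.2, 3.2]     # stage 2: regressor output

TOL = 0.05  # assumed tolerance below which an atom counts as non-magnetic
pred_moment = two_part_predict(p_magnetic, magnitude)
true_label = [abs(m) > TOL for m in true_moment]
pred_label = [abs(m) > TOL for m in pred_moment]

f1 = f1_score(true_label, pred_label)
corr = mcc(true_label, pred_label)
err = mae(true_moment, pred_moment)
print(f"F1={f1:.3f}  MCC={corr:.3f}  MAE={err:.3f} mu_B")
```

MCC is reported alongside F1 because it also accounts for true negatives, which matters when non-magnetic atoms dominate the data.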