The element specificity of soft X-ray spectroscopy makes it an ideal tool for analyzing the microscopic origin of ultrafast dynamics induced by localized optical excitation in metal-insulator heterostructures. Using [Fe/MgO]n as a model system, we perform ultraviolet pump/soft X-ray probe experiments, which are sensitive to all constituents of these heterostructures, to probe both electronic and lattice excitations. Complementary ultrafast electron diffraction experiments independently analyze the lattice dynamics of the Fe constituent and, together with ab initio calculations, yield comprehensive insight into the microscopic processes leading to local relaxation within a single constituent or non-local relaxation between two constituents. Besides electronic excitations in Fe, which are monitored at the Fe L3 absorption edge and relax within 1 ps by electron-phonon coupling, soft X-ray analysis identifies a change at the oxygen K absorption edge of the MgO layers that occurs within 0.5 ps. This ultrafast energy transfer across the Fe-MgO interface is mediated by high-frequency interface vibrational modes, which are excited by hot electrons in Fe and couple to vibrations in MgO in a mode-selective, non-thermal manner. A second, slower timescale is identified at the oxygen K pre-edge and the Fe L3 edge. The slower process represents energy transfer by acoustic phonons and contributes to thermalization of the entire heterostructure. We thus find that the interfacial energy transfer is associated with non-equilibrium behavior in the phonon system. Because our experiments lack signatures of charge transfer across the interface, we conclude that phonon-mediated processes dominate the competition between electronic and lattice excitations in these non-local, non-equilibrium dynamics.