The neural basis of action understanding in humans remains disputed, with some research implicating the putative mirror neuron system (MNS) and other research implicating a mentalizing system (MZS) for inferring mental states. The basis for this dispute may be that action understanding is a heterogeneous construct: actions can be understood from sensory information about body movements or from language about action, and the goal can be to understand either the implementation ("how") or the motive ("why") of an action. Although extant research implicates the MNS in understanding implementation and the MZS in understanding motive, it remains unknown to what extent these systems subserve modality-specific or supramodal functions in action understanding. While undergoing fMRI, 21 volunteers considered the implementation ("How is she doing it?") and motive ("Why is she doing it?") for actions presented in video or text. Bilateral parietal and right frontal areas of the MNS showed a modality-specific association with perceiving actions in videos, while left-hemisphere MNS showed a supramodal association with understanding implementation. Largely left-hemisphere MZS showed a supramodal association with understanding motive; however, connectivity between the MZS and MNS during the inference of motive was modality specific, being significantly stronger when motive was understood from actions in videos than from actions in text. These results support a tripartite model of MNS and MZS contributions to action understanding, in which distinct areas of the MNS contribute to action perception ("perceiving what") and the representation of action implementation ("knowing how"), while the MZS supports an abstract, modality-independent representation of the mental states that explain action performance ("knowing why").