Based on acousto-electric modulation, ultrasound-modulated electrical resistance tomography (UMERT) is expected to provide high spatial resolution: coupling impedance measurements to localized mechanical vibrations enriches the data, so more information about the conductivity distribution can be extracted. A difference sensitivity matrix constructed from the reference field and the measured field is proposed for UMERT. First, because the difference sensitivity matrix carries conductivity information from the measured field, it suppresses the adverse influence of soft-field effects, which usually cause relatively large errors in traditional electrical resistance tomography (ERT) image reconstruction. Second, the differential form of the proposed matrix reduces the effect of electric-field nonlinearity on the sensitivity distribution; this nonlinearity otherwise appears as relatively low sensitivity in the central area and high sensitivity near the boundary. Finally, the differential form also reduces the influence of systematic errors on the measurement data and thus further improves the spatial resolution of the reconstructed images. In addition, three current excitation patterns are discussed in order to obtain the best sensitivity of boundary voltage variations to conductivity changes. The proposed sensitivity matrix and the corresponding reconstructed images are compared with those based on Geselowitz’s sensitivity theorem in ERT and with those constructed from the measured field alone in UMERT. Both theoretical analysis and simulation results verify the feasibility of the proposed difference sensitivity matrix. Images reconstructed with it show higher spatial resolution, especially for small objects, and the method also identifies object size more reliably and is more robust to noise.
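For orientation, the comparison baseline named above can be sketched numerically. Geselowitz's sensitivity theorem gives a per-pixel coefficient proportional to the negative dot product of the potential gradients of two fields (here a drive field `phi` and a measurement lead field `psi`), integrated over the pixel and normalized by the injected currents. The following is a minimal sketch under stated assumptions, not the authors' implementation: the regular square grid, the finite-difference gradient, the function names `gradient` and `sensitivity_map`, and the toy linear fields are all hypothetical choices made for illustration. It covers only the classical Geselowitz-type coefficient and does not attempt the proposed difference sensitivity matrix.

```python
def gradient(field, h):
    """Finite-difference gradient on a regular grid with spacing h.

    Central differences at interior nodes, one-sided differences at edges.
    `field` is a list of rows (row index = y, column index = x).
    """
    ny, nx = len(field), len(field[0])
    gx = [[0.0] * nx for _ in range(ny)]
    gy = [[0.0] * nx for _ in range(ny)]
    for r in range(ny):
        for c in range(nx):
            c0, c1 = max(c - 1, 0), min(c + 1, nx - 1)
            r0, r1 = max(r - 1, 0), min(r + 1, ny - 1)
            gx[r][c] = (field[r][c1] - field[r][c0]) / ((c1 - c0) * h)
            gy[r][c] = (field[r1][c] - field[r0][c]) / ((r1 - r0) * h)
    return gx, gy


def sensitivity_map(phi, psi, h, i_phi=1.0, i_psi=1.0):
    """Geselowitz-type coefficient per pixel:
    -(grad phi . grad psi) * pixel_area / (I_phi * I_psi).
    """
    pgx, pgy = gradient(phi, h)
    qgx, qgy = gradient(psi, h)
    ny, nx = len(phi), len(phi[0])
    area = h * h
    return [
        [
            -(pgx[r][c] * qgx[r][c] + pgy[r][c] * qgy[r][c]) * area / (i_phi * i_psi)
            for c in range(nx)
        ]
        for r in range(ny)
    ]


# Toy fields (assumed, not from the paper): phi = x and psi = x + 2y.
h, n = 0.5, 4
phi = [[c * h for c in range(n)] for r in range(n)]
psi = [[c * h + 2.0 * r * h for c in range(n)] for r in range(n)]
s_map = sensitivity_map(phi, psi, h)
```

With these linear toy fields the gradients are constant, so every pixel gets the same coefficient. In a real ERT model the potentials come from a forward solver, and one such map is computed for each drive and measurement pair to form one row of the sensitivity matrix.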