In the mobile edge computing (MEC) environment, edge servers with storage and computing resources are deployed at base stations within users' geographic proximity to extend the capabilities of cloud computing to the network edge. An edge storage system (ESS) is composed of connected edge servers in a specific area and ensures low-latency services for users. However, the high data storage overhead incurred by edge servers' limited storage capacities is a key challenge in ensuring the performance of applications deployed on an ESS. Data deduplication, a classic data reduction technology, has been widely applied in cloud storage systems. It also offers a promising solution for reducing data redundancy in ESSs. However, the unique characteristics of MEC, such as edge servers' geographic distribution and coverage, render cloud data deduplication mechanisms obsolete. In addition, data must be distributed evenly across the edge servers in an ESS to accommodate future data demands, and data deduplication must not undermine this balance. Thus, balanced edge data deduplication (BEDD) must consider the deduplication ratio, data storage benefits, and resource balance systematically under the latency constraint. In this article, we model the novel BEDD problem formally and prove its NP-hardness. Then, we propose an optimal approach for solving the BEDD problem exactly in small-scale scenarios and a sub-optimal approach for solving large-scale BEDD problems with a theoretical performance guarantee. Extensive experiments conducted on a real-world dataset demonstrate the significant performance improvements of our approaches over four representative approaches.