Parallel and distributed computing has been adopted by many scientific applications for its aggregate computational power and memory capacity. To use resources efficiently and speed up application execution, tasks are split and dispatched across multiple computing elements. Ideally, all subtasks finish at roughly the same time. However, this is not always achievable because computing resources, workloads, and networks are heterogeneous and dynamic, so dynamic load balancing is needed. This paper proposes a Mesh Closure Detection (MCD) scheme to speed up the data/task partitioning process, narrow the repartitioning space, and reduce the overhead of load balancing. Both data repartitioning and load balancing are accomplished locally. Because it requires no global control, MCD can be adapted to decentralized systems, such as Peer-to-Peer systems, to conduct multiple local load balancing events in different regions concurrently. Formal analyses of its correctness are provided. MCD is also applied to a real application, Relativistic Particle Transport simulation, to demonstrate its effectiveness and efficiency.