SUMMARY

Grid applications are normally deployed on computing nodes in advance, which can leave the nodes hosting popular ("hot") applications constantly busy while the others remain idle. As a result, the overall performance of such a Grid system (e.g. throughput and load balancing) is seriously degraded. In this paper, we present Hierarchical and Dynamic Deployment of Application (HDDA) in Grid to improve system performance. With HDDA, an application can be dynamically deployed and undeployed when necessary. To reduce the overhead caused by HDDA, we also propose the Average Latency Ratio Minimum (ALR-MIN) replacement strategy, which deploys applications to the nodes with the minimum ALR of Node (NALR) and evicts the applications that incur the minimum increment of ALR. Experiments conducted on ChinaGrid show that HDDA achieves 10% and 24% lower average complete time (ACT) than the non-HDDA and Static Deployment of Application (SDA) schemes, respectively. The throughput and load balancing of HDDA are also better than those of the other two schemes. Simulations performed on a simulator developed specifically for this research show that the ALR-MIN replacement strategy yields 17% less relative delay-time of jobs than the well-known Least Recently Used (LRU)-based strategies in a typical setting.
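
The two selection rules of ALR-MIN stated above can be illustrated with a minimal sketch. It is not the implementation evaluated in this work. The summary does not define how NALR or the ALR increment is computed, so both are treated here as precomputed inputs. All names (`pick_deploy_node`, `pick_eviction_victim`, `deploy`, `capacity`) are hypothetical, and the per-node capacity limit is an assumption made only to trigger an eviction.

```python
from typing import Dict, List, Optional, Tuple


def pick_deploy_node(nalr_by_node: Dict[str, float]) -> str:
    """Return the node with the minimum NALR (values assumed precomputed)."""
    if not nalr_by_node:
        raise ValueError("no candidate nodes")
    return min(nalr_by_node, key=nalr_by_node.get)


def pick_eviction_victim(alr_increment_by_app: Dict[str, float]) -> str:
    """Return the application with the minimum ALR increment (assumed precomputed)."""
    if not alr_increment_by_app:
        raise ValueError("no deployed applications to evict")
    return min(alr_increment_by_app, key=alr_increment_by_app.get)


def deploy(
    app: str,
    nalr_by_node: Dict[str, float],
    apps_on_node: Dict[str, List[str]],
    alr_increment: Dict[str, float],
    capacity: int,
) -> Tuple[str, Optional[str]]:
    """Deploy `app` on the minimum-NALR node, evicting one application if it is full.

    `capacity` (maximum applications per node) is a hypothetical constraint.
    Returns the chosen node and the evicted application, or None if none was evicted.
    """
    if capacity < 1:
        raise ValueError("capacity must be at least 1")
    node = pick_deploy_node(nalr_by_node)
    hosted = apps_on_node.setdefault(node, [])
    evicted = None
    if len(hosted) >= capacity:
        # Only applications already on the chosen node are eviction candidates.
        evicted = pick_eviction_victim({a: alr_increment[a] for a in hosted})
        hosted.remove(evicted)
    hosted.append(app)
    return node, evicted


if __name__ == "__main__":
    nalr = {"node1": 0.8, "node2": 0.3}       # illustrative values only
    placement = {"node2": ["appA", "appB"]}
    increments = {"appA": 0.5, "appB": 0.1}   # illustrative values only
    print(deploy("appC", nalr, placement, increments, capacity=2))
    print(placement)
```

In this sketch, a tie between equal values is broken by dictionary iteration order. A real scheduler would need an explicit tie-breaking rule and a way to refresh NALR after each deployment.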