Caches are widely used in embedded systems to bridge the increasing speed gap between processors and off-chip memory. However, caches make it significantly harder to compute the worst-case execution time (WCET) of a task. To alleviate this problem, cache locking has been proposed. We investigate the WCET-aware I-cache locking problem and propose a novel dynamic I-cache locking heuristic approach for reducing the WCET of a task. For a nonnested loop, our approach uses a min-cut algorithm to select a minimum set of memory blocks of the loop as locked cache contents. For a loop nest, our approach not only selects a minimum set of memory blocks of the loop nest as locked cache contents but also finds a good loading point for each selected memory block. We propose two algorithms for finding these loading points, a polynomial-time heuristic algorithm and an integer linear programming (ILP)-based algorithm, which further reduce the WCET of each loop nest. We have implemented our approach and compared it to two state-of-the-art I-cache locking approaches using a set of benchmarks from the MRTC benchmark suite. The experimental results show that the polynomial-time heuristic algorithm for finding loading points performs almost as well as the ILP-based algorithm. Compared to the partial locking approach proposed in Ding et al. [2012], our approach using the heuristic algorithm achieves average improvements of 33%, 15%, 9%, 3%, 8%, and 11% for the 256B, 512B, 1KB, 4KB, 8KB, and 16KB caches, respectively. Compared to the dynamic locking approach proposed in Puaut [2006], it achieves average improvements of 9%,