Cache misses frequently exhibit repeated streaming behavior, i.e. a sequence of cache misses has a high tendency of being repeated. Correlation-based prefetchers record the missing streams in a history table for accurate prefetching. Saving a large miss history in off-chip DRAM is a practical implementation, but incurs access latency and consumes memory bandwidth which leads to performance degradation.In this paper, we investigate a new data prefetching mechanism based on per-block miss correlation where a miss is correlated with an earlier miss when the two misses are closely encountered both in time and space. The miss correlations are captured dynamically and saved along with the content of the data block using a simple data compression technique. As a result of this novel combination, our scheme provides unbounded correlation history and its prefetch metadata can be fetched together with demand data without incurring additional latency nor consuming any memory bandwidth. Performance evaluations using data-parallel applications demonstrate that prefetchers based on per-block miss correlations can improve IPC by 42-139% with an average of 88% compared to the IPC without prefetching. In comparison with regular stream prefetcher, sampled temporal streaming prefetcher and spatial-temporal memory streaming prefetcher, up to 115%, 99% and 98% IPC improvement can be obtained with an average about 36%, 26% and 27% respectively.