Increasing the image size of a video sequence aggravates the memory bandwidth problem of a video coding system. Although many embedded compression (EC) algorithms have been proposed to overcome this problem, no lossless EC algorithm able to handle high-definition (HD) size video sequences has been proposed thus far. In this paper, a lossless EC algorithm for HD video sequences and a related hardware architecture are proposed. The proposed algorithm consists of two steps. The first step is a hierarchical prediction method based on pixel averaging and copying. The second step is significant bit truncation (SBT), which encodes the prediction errors in a group with the same number of bits so that multiple prediction errors can be decoded in a single clock cycle. A theoretical lower bound on the compression ratio of the SBT coding is also derived. Experimental results show an average memory bandwidth reduction of 60%. Hardware implementation results show that a throughput of 14.2 pixels/cycle can be achieved with 36K gates, which is sufficient to handle HD-size video sequences in real time.
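The idea behind coding a group of prediction errors with a common bit width can be illustrated with a minimal Python sketch. It is not the implementation proposed in this paper: the function names (`bits_needed`, `sbt_encode`, `sbt_decode`), the two's-complement representation, the 4-bit width header, and the group of four errors are all assumptions made for illustration only. The sketch shows why a shared width makes every field boundary known once the header is read, which is what would allow several errors to be extracted in parallel.

```python
def bits_needed(errors):
    """Smallest two's-complement width that holds every error in the group."""
    n = 1
    while not all(-(1 << (n - 1)) <= e < (1 << (n - 1)) for e in errors):
        n += 1
    return n


def sbt_encode(errors, header_bits=4):
    """Encode a group as a width header followed by fixed-width fields.

    header_bits is an assumed parameter, not a value taken from the paper.
    """
    n = bits_needed(errors)
    if n >= (1 << header_bits):
        raise ValueError("width does not fit in the assumed header")
    mask = (1 << n) - 1
    bits = format(n, f"0{header_bits}b")
    for e in errors:
        # Masking keeps the low n bits, i.e. drops redundant sign-extension bits.
        bits += format(e & mask, f"0{n}b")
    return bits


def sbt_decode(bits, count, header_bits=4):
    """Recover `count` signed errors; each field offset depends only on n."""
    n = int(bits[:header_bits], 2)
    errors = []
    for i in range(count):
        start = header_bits + i * n
        field = int(bits[start:start + n], 2)
        if field >= 1 << (n - 1):  # undo two's-complement wrap for negatives
            field -= 1 << n
        errors.append(field)
    return errors


group = [3, -2, 0, 1]
code = sbt_encode(group)
assert sbt_decode(code, len(group)) == group
```

In this sketch the group `[3, -2, 0, 1]` needs 3 bits per error, so it occupies 16 bits including the assumed header, compared with 36 bits if each error were stored as a 9-bit signed value.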