Lossless image compression reduces image size, improving transmission efficiency and saving storage space, while allowing the original image to be reconstructed exactly. Among such techniques, the LOCO‐I/JPEG‐LS algorithm offers a high compression ratio at low computational complexity and is therefore widely used in real‐time applications. However, the context dependency in LOCO‐I severely constrains the parallelism of the algorithm, which limits the throughput and real‐time performance of hardware implementations. Existing designs gain parallelism either at substantial hardware cost or through straightforward chunking that sacrifices compression ratio. To balance parallelism against compression ratio, this paper proposes a chunk‐oriented error modeling scheme for LOCO‐I that enables parallelism in both compression and decompression and achieves a better compression ratio within chunks. Based on the optimized algorithm, a high‐throughput flexible lossless compression and decompression architecture (HFCD) is proposed, which achieves a higher pixels‐per‐clock (PPC) rate at lower hardware cost. Additionally, HFCD introduces a parameter sharing mechanism that enables random access to image chunks, making decompression more flexible. Experimental results show that, compared with state‐of‐the‐art works, HFCD improves the compression PPC by 3.02–13.50 times. For decompression, HFCD achieves a 22.4 times speedup over the software solution.
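
To make the dependency problem concrete, the following is a minimal Python sketch of the baseline that the abstract calls straightforward chunking. It is not the chunk‐oriented error modeling scheme proposed here, and it is not a conforming JPEG‐LS coder. It assumes the standard LOCO‐I median edge detector predictor, collapses all contexts into a single adaptive state, treats out‐of‐chunk neighbours as 0, and only estimates the Golomb‐Rice code length instead of emitting bits. The names `med_predict`, `estimate_bits`, and `split_columns` and the initial state values are illustrative assumptions.

```python
def med_predict(a, b, c):
    """LOCO-I median edge detector: a = left, b = above, c = upper-left."""
    if c >= max(a, b):
        return min(a, b)
    if c <= min(a, b):
        return max(a, b)
    return a + b - c


def estimate_bits(chunk):
    """Estimate the Golomb-Rice code length of one chunk (a list of rows).

    Simplifications (assumptions, not JPEG-LS behaviour): one shared
    context instead of gradient-quantized contexts, no bias correction,
    no run mode, and neighbours outside the chunk are taken as 0.
    """
    a_sum, n = 4, 1  # adaptive state; initial values chosen for illustration
    bits = 0
    for y, row in enumerate(chunk):
        for x, px in enumerate(row):
            a = row[x - 1] if x > 0 else 0
            b = chunk[y - 1][x] if y > 0 else 0
            c = chunk[y - 1][x - 1] if x > 0 and y > 0 else 0
            err = px - med_predict(a, b, c)
            # Map the signed error to a non-negative integer.
            mapped = 2 * err if err >= 0 else -2 * err - 1
            # The Golomb parameter k depends on every earlier error in the
            # chunk: this is the serial dependency that limits parallelism.
            k = 0
            while (n << k) < a_sum:
                k += 1
            bits += (mapped >> k) + 1 + k
            # State update that the next pixel has to wait for.
            a_sum += abs(err)
            n += 1
    return bits


def split_columns(image, chunk_width):
    """Straightforward chunking: cut the image into vertical strips."""
    width = len(image[0])
    return [[row[x0:x0 + chunk_width] for row in image]
            for x0 in range(0, width, chunk_width)]


if __name__ == "__main__":
    image = [[(3 * x + 5 * y) % 256 for x in range(16)] for y in range(8)]
    whole = estimate_bits(image)
    # Each chunk starts from a fresh state, so chunks are independent and
    # could be handed to separate workers or hardware pipelines.
    chunked = sum(estimate_bits(ch) for ch in split_columns(image, 4))
    print("estimated bits, whole image:", whole)
    print("estimated bits, 4-pixel-wide chunks:", chunked)
```

In this sketch each chunk restarts its statistics and loses its neighbours at the chunk border, which is the kind of compression‐ratio cost the abstract attributes to straightforward chunking.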