The H.264/AVC standard allows for a high compression efficiency at the cost of computational complexity. To achieve as high as possible efficiency, the proposed architecture supports the mode selection based on the rate-distortion optimization. In particular, the dataflow assumes throughput of 32 samples/coefficient per clock cycle, on average, allowing a lot of compression options to be checked. Moreover, the architecture supports all transform sizes specified for High Profile using the same hardware resources. Synthesis results show that the design can work at 100 MHz for FPGA Stratix II devices.