This paper proposes efficient one-dimensional (1-D) N-point discrete cosine transform and inverse discrete cosine transform (DCT/IDCT) architectures based on a sub-band decomposition algorithm. Using the row-column decomposition technique, a two-dimensional (2-D) N×N DCT/IDCT architecture is then built from two successive 1-D DCT/IDCT processors and one transpose memory. The orthonormal property of the DCT/IDCT transformation matrices is fully exploited to reduce hardware complexity. The proposed architectures have computational complexity O(5N/8) for the DCT and O(3N/8) for the IDCT, and low hardware complexity O(3N/8) for both; they are fully pipelined and scalable for variable-length 2-D DCT/IDCT computation.
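
As a rough software illustration of the row-column decomposition and of the orthonormal property mentioned above, the sketch below computes a 2-D DCT as two passes of a 1-D transform separated by a transpose, and obtains the inverse by using the transpose of the same matrix. It is only a minimal sketch: it uses a direct matrix-vector product with the standard orthonormal DCT-II matrix and does not implement the sub-band decomposition algorithm or the pipelined hardware of the proposed architecture. All function names (`dct_matrix`, `dct_1d`, `idct_1d`, `transform_2d`) are hypothetical and chosen for illustration.

```python
import math


def dct_matrix(n):
    """Return the n x n orthonormal DCT-II matrix C, for which C * C^T = I."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def transpose(m):
    """Software stand-in for the transpose memory: swap rows and columns."""
    return [list(col) for col in zip(*m)]


def matvec(m, v):
    """Plain matrix-vector product."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def dct_1d(v, c):
    """1-D forward DCT of vector v using the DCT matrix c."""
    return matvec(c, v)


def idct_1d(v, c):
    """1-D inverse DCT; by orthonormality the inverse of c is its transpose."""
    return matvec(transpose(c), v)


def transform_2d(block, op, c):
    """Row-column decomposition: 1-D pass, transpose, second 1-D pass."""
    stage1 = [op(row, c) for row in block]           # first 1-D processor (rows)
    stage1_t = transpose(stage1)                     # transpose memory
    stage2 = [op(row, c) for row in stage1_t]        # second 1-D processor (columns)
    return transpose(stage2)                         # restore row-major orientation


if __name__ == "__main__":
    n = 4
    c = dct_matrix(n)
    block = [[float(n * r + k) for k in range(n)] for r in range(n)]
    coeffs = transform_2d(block, dct_1d, c)
    restored = transform_2d(coeffs, idct_1d, c)
    err = max(abs(restored[r][k] - block[r][k])
              for r in range(n) for k in range(n))
    print("max round-trip error:", err)
```

Passing `dct_1d` or `idct_1d` as `op` selects the forward or inverse 2-D transform with the same data path, which loosely mirrors how one 1-D processor design can serve both passes around a single transpose memory.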