Supporting an iris recognition algorithm in real time is a considerable challenge for portable systems, which are commonly used in the field. This paper presents an efficient parallel and pipelined architecture for the feature extraction and template matching processes of the Ridge Energy Direction (RED) algorithm for iris recognition. The proposed architecture reduces computational complexity while sustaining high performance through techniques that include (i) a circle approximation method for the iris unwrapping process, (ii) a parallel design with an on-chip buffer for 2D convolution in the feature extraction process, and (iii) an approximation method for log2 and inverse-log2 conversion in the template matching process. Performance analysis shows that the proposed architecture achieves an 881-fold speedup over the conventional method. The proposed design can be integrated with an embedded microprocessor to realize a complete system-on-chip solution for a portable iris recognition system.
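To illustrate the kind of saving that a log2 and inverse-log2 approximation can offer, the following is a minimal software sketch of Mitchell's classical piecewise-linear approximation, in which a multiplication is replaced by a leading-one detection, shifts, and one addition. The abstract does not state which approximation the proposed architecture uses, so Mitchell's method is an assumption here, as are the names `approx_log2`, `approx_exp2`, and `approx_mul` and the 16-bit fraction width `FRAC_BITS`.

```python
# Illustrative sketch only: Mitchell's approximation is assumed, not taken
# from the paper. All names and the fraction width are hypothetical.
FRAC_BITS = 16  # assumed fixed-point fraction width


def approx_log2(x: int) -> int:
    """Approximate log2(x) as a fixed-point value with FRAC_BITS fraction bits.

    Writes x = 2**k * (1 + m) with 0 <= m < 1 and returns k + m,
    i.e. log2(1 + m) is approximated by m.
    """
    if x <= 0:
        raise ValueError("x must be a positive integer")
    k = x.bit_length() - 1        # position of the leading one
    mantissa = x - (1 << k)       # bits below the leading one
    if k >= FRAC_BITS:
        frac = mantissa >> (k - FRAC_BITS)
    else:
        frac = mantissa << (FRAC_BITS - k)
    return (k << FRAC_BITS) | frac


def approx_exp2(y: int) -> int:
    """Approximate 2**y for a non-negative fixed-point y (inverse of approx_log2).

    Splits y into integer part k and fraction f, and returns 2**k * (1 + f).
    """
    if y < 0:
        raise ValueError("y must be non-negative")
    k = y >> FRAC_BITS
    frac = y & ((1 << FRAC_BITS) - 1)
    one_plus_f = (1 << FRAC_BITS) + frac
    if k >= FRAC_BITS:
        return one_plus_f << (k - FRAC_BITS)
    return one_plus_f >> (FRAC_BITS - k)


def approx_mul(a: int, b: int) -> int:
    """Approximate a * b with one addition in the log domain."""
    return approx_exp2(approx_log2(a) + approx_log2(b))


if __name__ == "__main__":
    for a, b in [(8, 4), (3, 3), (100, 37)]:
        print(a, b, a * b, approx_mul(a, b))
```

Products of powers of two are reproduced exactly by this sketch, while other operands are underestimated, with a worst-case relative error of about 11% for Mitchell's method. Whether such an error is tolerable depends on how the matching score is thresholded, which the abstract does not specify.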