Image feature detection and matching is a fundamental operation in image processing. As the detected and matched features are used as input data for high-level computer vision algorithms, the matching accuracy directly influences the quality of the results of the whole computer vision system. Moreover, as the algorithms are frequently used as a part of a real-time processing pipeline, the speed at which the input image data are handled is also a concern. The paper proposes an embedded system architecture for feature detection and matching. The architecture implements the FAST feature detector and the BRIEF feature descriptor and is capable of establishing key point correspondences in the input image data stream coming from either an external sensor or memory at a speed of hundreds of frames per second, so that it can cope with most demanding applications. Moreover, the proposed design is highly flexible and configurable, and facilitates the trade-off between the processing speed and programmable logic resource utilization. All the designed hardware blocks are designed to use standard, widely adopted hardware interfaces based on the AMBA AXI4 interface protocol and are connected using an underlying direct memory access (DMA) architecture, enabling bottleneckfree inter-component data transfers.