This work proposes a highly tunable motion estimation architecture. We implement the Horn and Schunck algorithm with the hierarchical extension for larger motion estimations in FPGAs. Different architectures are explored dealing with interpolation, pipeline, parallelism and arithmetic format, in order to fit performance. We show in our exploration, how the different cores of our system should be used to increase the throughput. Our smallest design achieves a 30.8 Mpixel/s in a 1024x1024 resolution and the fastest 507 Mpixel/s which is one of the fastest ever achieved, as far as we know, for FPGAs.