In this paper we presents a systematic approach to design hardware circuits for bit-dimension permutations. The proposed approach is based on decomposing any bit-dimension permutation into elementary bit-exchanges. Such decomposition is proven to achieve the theoretical minimum number of delays required for the permutation. This offers optimum solutions for multiple well-known problems in the literature that make use of bit-dimension permutations. This includes the design of permutation circuits for the fast Fourier transform (FFT), bit reversal, matrix transposition, stride permutations and Viterbi decoders.