Computationally intensive digital signal processing (DSP) systems sometimes have real-time requirements beyond what programmable processor platforms, consisting of RISC and DSP processors, can achieve. Adding Field Programmable Gate Array (FPGA) components to these platforms provides a configurable hardware resource in which increased levels of parallelism allow very high computational rates. Techniques that derive circuit architectures from signal flow graph (SFG) expressions of an algorithm can produce highly efficient processor implementations. Applying folding transformations produces implementations in which hardware resource usage is reduced, possibly at the expense of throughput. This paper presents a new development methodology that analyses the SFG of the algorithm to suggest appropriate folding techniques. Characterizing the different folding techniques allows a template circuit architecture to be created early in the design process, and this template does not change throughout the remainder of the implementation process. Retiming techniques applied to the algorithm SFG then produce the properly timed implementation from the template. With this methodology, architectural exploration can be performed quickly and efficiently to generate a set of implementations (an 'implementation space') from which the one that best meets the system constraints can be selected. When applied to a Normalised Lattice Filter design example, the methodology yields large savings in FPGA resource usage with little reduction in real-time performance, demonstrating its implementation advantage.
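
The trade-off that folding introduces can be illustrated with a minimal sketch. It is not the methodology of this paper, only an idealised first-order model under stated assumptions: all operations in the SFG are identical, each hardware unit completes one operation per clock cycle, and the clock rate is unaffected by folding. The function names `fold` and `implementation_space`, and the parameters `num_operations`, `folding_factor` and `clock_mhz`, are hypothetical and chosen for illustration.

```python
from math import ceil


def fold(num_operations, folding_factor, clock_mhz):
    """Idealised effect of folding on hardware cost and throughput.

    Assumption: `folding_factor` operations are time-multiplexed onto one
    hardware unit, so one iteration of the algorithm takes `folding_factor`
    clock cycles and the clock rate itself does not change.
    """
    if folding_factor < 1:
        raise ValueError("folding_factor must be at least 1")
    # Each unit serves up to `folding_factor` operations per iteration.
    units = ceil(num_operations / folding_factor)
    # One new sample is accepted every `folding_factor` clock cycles.
    sample_rate_mhz = clock_mhz / folding_factor
    return units, sample_rate_mhz


def implementation_space(num_operations, clock_mhz, folding_factors):
    """List (folding_factor, units, sample_rate_mhz) for each candidate."""
    return [
        (f, *fold(num_operations, f, clock_mhz))
        for f in folding_factors
    ]


if __name__ == "__main__":
    # Hypothetical example: 8 operations, 100 MHz clock.
    for f, units, rate in implementation_space(8, 100.0, [1, 2, 4, 8]):
        print(f"folding factor {f}: {units} unit(s), {rate:.1f} MHz sample rate")
```

In this simple model the sample rate falls in proportion to the folding factor. A real design need not behave this way, since the achievable clock rate and the control and storage overhead both depend on the folded architecture.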