Debugging memory test failures in a system-on-chip design is becoming difficult due to the growing number and sizes of the embedded memories. Low-complexity marching tests, which are ideally suited for production testing, are insufficient for debug and diagnostics. On-chip support for multiple memory test algorithms can be prohibitively expensive. Moreover, memory test engineers would like the flexibility to make small changes to the test sequence. Runtime programmability can be provided through the use of programmable finite state machines and/or microcode in the BIST controllers. Since such controllers have higher area requirement, it is difficult to employ multiple controllers and distribute them geographically on the chip. Therefore, the BIST controller can become a routing hot-spot. Existing memory BIST insertion flows operate on a post-synthesis net-list and ignore the constraints that will be posed by the physical design step that will follow. These constraints include routing congestion and interconnect timing. Similarly, the synthesis of the BIST logic must also address area, test application time and test power constraints. In this paper, we formulate the problem of programmable memory BIST synthesis as an optimization problem and describe an implementation. Results show upto 3X improvement in area and wirelength for industrial designs when a layout-aware flow is used as opposed to manual BIST implementation.