Correct application of masking on hardware implementation of cryptographic primitives necessitates the instantiation of registers in order to achieve the non-completeness (commonly said to stop the propagation of glitches). This sometimes leads to a high latency overhead, making the implementation not necessarily suitable for the underlying application. As a concrete example, this holds for Keccak. Application of d + 1 Domain Oriented Masking (DOM) on a round-based implementation of Keccak leads to the introduction of two register stages per round, i.e., two times higher latency. On the other hand, Rhythmic-Keccak, introduced in CHES 2018, unrolls two rounds to half the latency compared to an unprotected ordinary round-based implementation. To that end, td + 1 masking is used which requires a notable area, and – apart from the difficulty to construct – its extension to higher orders seems beyond the bounds of feasibility.In this paper, we focus on d + 1 masking and introduce a methodology which enables us to stay with the latency of an unprotected round-based implementation, i.e., one register stage per round. While being secure under glitch-extended probing model, we provide a general design where the desired security order can be easily adjusted without any effect on the above-given latency. Compared to the Rhythmic-Keccak, the synthesis results show that our first-order design is able to accomplish the entire operations of Keccak-f[200] in the same period of time while decreasing the area by 74.5%. Notably, our implementations achieve around 30% less delay compared to the corresponding original DOM-Keccak designs.