An N-bit priority resolver having N inputs and N outputs functions as polling hardware in an embedded system, enabling access to a resource when multiple devices initiate access requests at its inputs which may be located on-chip or off-chip. Subsystems such as data buses, comparators, fixed- and floating-point arithmetic units, interconnection network routers, etc., utilize the priority resolver function. In the literature, there are many transistor-level designs for the priority resolver based on dynamic CMOS logic, some of which are modular and others are not. This article presents a novel gate-level modular design of priority resolvers that can accommodate any number of inputs and outputs. Based on our modular design architecture, small-size priority resolvers can be conveniently combined to form medium- or large-size priority resolvers along with extra logic. The proposed modular design approach helps to reduce the coding complexity compared to the conventional direct design approach and facilitates scalability. We discuss the gate-level implementation of 4-, 8-, 16-, 32-, 64-, and 128-bit priority resolvers based on the direct and modular approaches and provide a performance comparison between these based on the design metrics. According to the modular approach, different sizes of priority resolver modules were used to implement larger-size priority resolvers. For example, a 4-bit priority resolver module was used to implement 8-, 16-, 32-, 64-, and 128-bit priority resolvers in a modular fashion. We used a 28 nm CMOS standard digital cell library and Synopsys EDA tools to synthesize the priority resolvers. The estimated design metrics show that the modular approach tends to facilitate increasing reductions in delay and power-delay product (PDP) compared to the direct approach, especially as the size of the priority resolver increases. For example, a 32-bit modular priority resolver utilizing 16-bit priority resolver modules had a 39.4% reduced delay and a 23.1% reduced PDP compared to a directly implemented 32-bit priority resolver, and a 128-bit modular priority resolver utilizing 16-bit priority resolver modules had a 71.8% reduced delay and a 61.4% reduced PDP compared to a directly implemented 128-bit priority resolver.