In this paper we propose EFX, an efficient FALCON accelerator based on a HW/SW co-design. FALCON is a post-quantum cryptographic (PQC) digital signature algorithm (DSA). Our analysis shows that FALCON has characteristics and structures that distinguish it from other PQC-DSAs. A key finding is that, unlike its counterparts, FALCON is not dominated by a single time-consuming task; instead, it comprises a variety of tasks with comparable execution times. Consequently, the conventional approach of accelerating a few dominant tasks, which is generally effective for other algorithms, is less efficient for FALCON, especially with respect to minimizing silicon area. To overcome this, we optimize fine-grained, lower-level operations rather than broader functional blocks, aiming to boost performance while conserving hardware area. Moreover, to mitigate the performance degradation caused by limited hardware resources, we pipeline the execution of the FALCON functions and refine the sampling function (a critical task that is difficult to accelerate because its algorithm is inherently sequential), enabling it to run concurrently in software and hardware and thus reducing latency. Our hardware design, synthesized at 300 MHz using Samsung's 28nm and 45nm process technologies, generates FALCON signatures with a 3.58× improvement in clock cycles over an existing hardware accelerator. EFX occupies 38K µm² and 74K µm² in the 28nm and 45nm processes, respectively, which is small compared to other PQC accelerators.