<p>Piecewise polynomial approximation of nonlinear functions plays an important role in neural network accelerators and digital signal processing. In this paper, we propose QPA, a quantization-aware piecewise polynomial approximation methodology that generates optimized coefficients for hardware implementations of any polynomial order. QPA incorporates several key features to minimize both the fitting error and the hardware cost: it uses the Remez algorithm to compute the minimax fitting polynomial, combines the fitting and quantization operations to obtain an error-flattened characteristic, assigns a specific coefficient bit width to each multiplier to reduce the hardware cost, and fine-tunes the truncated coefficients to further reduce the fitting error. We apply the proposed methodology to piecewise linear (PWL) and piecewise quadratic (PWQ) approximations. Experimental results show that QPA consistently achieves a lower fitting error than the state-of-the-art methods. We synthesized the proposed designs in TSMC 28 nm CMOS technology. The synthesis results show that the proposed designs achieve up to 43.8% area reduction and 37.5% fitting error reduction compared with state-of-the-art PWL designs, and up to 22.1% area reduction and 33.1% fitting error reduction compared with state-of-the-art PWQ designs.</p>
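<p>As a rough illustration of the general idea of fitting, quantizing, and then fine-tuning coefficients, the sketch below handles one PWL segment. It is not the QPA algorithm itself, and everything in it is an assumption made for illustration: the function names (<code>minimax_line</code>, <code>fit_segment</code>), the sigmoid target, the bit widths, the evaluation of each line in the segment-local coordinate <code>x - x0</code>, and the small exhaustive search over intercept values. In place of the Remez algorithm it uses a closed form that holds only when the function is convex or concave on the segment, where the minimax line has the secant slope.</p>

```python
import math


def sigmoid(x):
    """Example target function (an assumption; any smooth function works)."""
    return 1.0 / (1.0 + math.exp(-x))


def quantize(value, frac_bits):
    """Round to the nearest multiple of 2**-frac_bits (fixed-point grid)."""
    scale = 1 << frac_bits
    return round(value * scale) / scale


def max_error(f, a, b, x0, xs):
    """Largest absolute error of the line a*(x - x0) + b over the samples."""
    return max(abs(f(x) - (a * (x - x0) + b)) for x in xs)


def minimax_line(f, x0, x1, xs):
    """Minimax line on a segment where f is convex or concave.

    The slope is the secant slope; the intercept centres the residual so
    that its largest and smallest values have equal magnitude.
    """
    a = (f(x1) - f(x0)) / (x1 - x0)
    residuals = [f(x) - a * (x - x0) for x in xs]
    b = (max(residuals) + min(residuals)) / 2.0
    return a, b


def fit_segment(f, x0, x1, slope_bits, intercept_bits, samples=257, search=4):
    """Fit, quantize, then fine-tune the intercept of one segment.

    Returns (slope, intercept, naive_error, tuned_error), where naive_error
    is the error after plain rounding and tuned_error is the error after
    searching a few quantization steps around the rounded intercept.
    """
    xs = [x0 + (x1 - x0) * i / (samples - 1) for i in range(samples)]
    a, b = minimax_line(f, x0, x1, xs)

    # Independent bit widths for the two coefficients.
    a_q = quantize(a, slope_bits)
    b_q = quantize(b, intercept_bits)
    naive = max_error(f, a_q, b_q, x0, xs)

    # Fine-tuning: try neighbouring intercept codes and keep the best one.
    lsb = 2.0 ** -intercept_bits
    best_b, best_err = b_q, naive
    for k in range(-search, search + 1):
        candidate = b_q + k * lsb
        err = max_error(f, a_q, candidate, x0, xs)
        if err < best_err:
            best_b, best_err = candidate, err
    return a_q, best_b, naive, best_err


if __name__ == "__main__":
    # Eight uniform segments on [0, 4], where the sigmoid is concave.
    for i in range(8):
        lo, hi = 0.5 * i, 0.5 * (i + 1)
        a_q, b_q, naive, tuned = fit_segment(sigmoid, lo, hi, 6, 8)
        print(f"[{lo:.1f}, {hi:.1f}) a={a_q} b={b_q} "
              f"naive={naive:.5f} tuned={tuned:.5f}")
```

<p>Because the search includes the plainly rounded intercept, the tuned error in this sketch can never exceed the naive one. How much it improves depends on the function, the segment, and the chosen bit widths.</p>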