Among the approaches to rigorously modeling the scattering of electromagnetic waves in sub-wavelength photolithography, waveguide methods have proven to be accurate and efficient. Unlike time-domain methods such as FEM and FDTD, waveguide methods simultaneously compute the diffraction of incident plane waves at different incidence angles by a periodic dielectric structure. The EM modes inside each dielectric layer are found by decoupling the eigen-system, and the electromagnetic boundary conditions between each pair of adjacent layers are applied to stitch the modes together and obtain the full scattering matrix. The number of plane-wave orders under consideration directly affects the accuracy of a waveguide simulator. However, in a 3-D simulation, simulation time and memory usage increase drastically with the number of orders. These limitations prevent waveguide methods from being applied to large layout patterns that require higher simulation orders. Since many cases studied in lithography process optimization and layout printability analysis are symmetric about both the x and y axes, this symmetry can be used to simplify the calculation. Based on our definitions of the four fundamental groups of symmetric and anti-symmetric functions and the operations between these groups, we have made some fundamental discoveries about wave propagation inside symmetric dielectric structures. In this paper, we describe our new rigorous theory for decomposing the propagating waves in a symmetric layer into four symmetric and anti-symmetric groups of transmitted plane waves. Each group can be stitched directly to the next layer without losing its symmetry or anti-symmetry. Furthermore, any incident wave can be reorganized into these four plane wave groups, as can the reflected and refracted fields.
We show that with this field decomposition technique, the scattering of the four smaller diffraction mode groups can be computed separately, greatly reducing the computational complexity compared with the full-mode simulation. We have developed a new version of the 3-D simulator METRO that incorporates this method and achieves very high computational efficiency. For example, a 33 by 33-order, double-precision simulation requires only 300 MB of memory and takes less than 15 minutes on a 1 GHz processor, without any loss of accuracy, compared with 1.7 GB of memory and 10 hours of simulation time on the same processor without the symmetry simplification [1].
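The four-group decomposition can be illustrated with a minimal sketch. It assumes that the four groups correspond to functions that are even or odd in x and in y, which the text above does not spell out. The function names `split_symmetry_groups` and `group_sizes` are hypothetical and are not part of METRO. The sketch splits a sampled field into the four parts and counts how many independent plane-wave combinations each group holds when the orders on each axis run from -K to K.

```python
import numpy as np

def split_symmetry_groups(f):
    """Split samples f[i, j] ~ f(x_i, y_j), taken on a grid that is symmetric
    about both axes, into parts that are even/odd in x and even/odd in y."""
    fx = f[::-1, :]       # f(-x,  y)
    fy = f[:, ::-1]       # f( x, -y)
    fxy = f[::-1, ::-1]   # f(-x, -y)
    return {
        "ee": (f + fx + fy + fxy) / 4,  # even in x, even in y
        "eo": (f + fx - fy - fxy) / 4,  # even in x, odd in y
        "oe": (f - fx + fy - fxy) / 4,  # odd in x, even in y
        "oo": (f - fx - fy + fxy) / 4,  # odd in x, odd in y
    }

def group_sizes(orders_per_axis):
    """Count independent combinations per group for an odd number of orders
    per axis (assumption: orders run from -K..K, paired as +m and -m)."""
    half = (orders_per_axis - 1) // 2
    even = half + 1  # cosine-like combinations, including order 0
    odd = half       # sine-like combinations
    return {"ee": even * even, "eo": even * odd,
            "oe": odd * even, "oo": odd * odd}

# Example: an arbitrary field with no particular symmetry.
x = np.linspace(-1.0, 1.0, 9)
X, Y = np.meshgrid(x, x, indexing="ij")
field = np.exp(1j * (2.0 * X + 0.7 * Y)) + X * Y**2
parts = split_symmetry_groups(field)

# The four parts always add back up to the original field.
assert np.allclose(sum(parts.values()), field)
print(group_sizes(33))
```

Under these assumptions, a 33 by 33-order problem with 1089 orders splits into four groups of roughly a quarter of that size each. This is only a counting argument, not a reproduction of the memory and run-time figures reported above.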