<p><em>Abstract</em>-The use of GPUs for Monte Carlo particle transport lacks fair comparisons with CPUs. This work performs simulations on both a CPU and a GPU that share the same package and the same manufacturing process in a low-power mobile device. Experiments on simple pin-cell benchmark problems with fresh fuel give consistent results between the CPU and the GPU. They also show that the Apple M1 GPU is twice as capable as the M1 CPU, with a fivefold advantage in power consumption. A particle sorting algorithm optimized for the GPU improves computing efficiency by 28% while markedly reducing GPU power consumption. This advantage of the sorting algorithm is expected to be greater for depleted-fuel problems than for fresh-fuel problems. The kernel reconstruction Doppler broadening algorithm, designed for continuously varying materials, is shown to produce Doppler coefficients consistent with those of the reference code, and it can be implemented efficiently on the GPU. Compared with the reference code, which uses double-precision floating-point numbers, the test codes, which use single-precision floating-point numbers, can underestimate the K-effective values by about 500 pcm, although the Doppler coefficients of the fuel are well reproduced. These conclusions may strengthen the argument that adopting GPUs helps high-performance computers reduce gross power consumption.</p>
<p>This work is an English translation of an article submitted to the CORPHY 2022 conference (http://corphy2022.org.cn). The original text is in Chinese, and there may be minor differences in content.</p>
<p>This work is also an extension of the published article at https://www.sciencedirect.com/science/article/pii/S0306454922001852</p>