Deploying modern neural networks on resource-constrained edge devices necessitates a series of optimizations to ready them for production. These optimizations typically involve pruning, quantization, and fixed-point conversion to compress the model and improve energy efficiency. While such optimizations are generally adequate for most edge devices, energy efficiency can be improved further by leveraging special-purpose hardware and unconventional computing paradigms. In this study, we explore stochastic computing neural networks, examining how weight distributions affect quantization and overall performance. When arithmetic operations such as addition and multiplication are executed by stochastic computing hardware, the arithmetic error can increase significantly, degrading overall accuracy. To bridge the accuracy gap between a fixed-point model and its stochastic computing implementation, we propose a novel approximate arithmetic-aware training method. We validate the efficacy of our approach by implementing the LeNet-5 convolutional neural network on an FPGA. Our experimental results reveal a negligible accuracy degradation of merely 0.01% compared with the floating-point counterpart, while achieving a substantial 27× speedup and a 33× improvement in energy efficiency compared with other FPGA implementations. Additionally, the proposed method enhances the likelihood of selecting optimal LFSR seeds for stochastic computing systems.
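To make the arithmetic model concrete, the following minimal Python sketch illustrates the general idea of unipolar stochastic computing (not the paper's FPGA implementation): a value in [0, 1] is encoded as a bitstream whose probability of a 1 equals the value, multiplication reduces to a bitwise AND, and the pseudo-random comparison samples come from an LFSR. The 16-bit LFSR tap positions and the seed values below are illustrative choices; as the abstract notes, seed selection influences how well two streams decorrelate, and hence the arithmetic error.

```python
def lfsr_sequence(seed, length, width=16):
    """Fibonacci LFSR with the maximal-length 16-bit polynomial
    x^16 + x^14 + x^13 + x^11 + 1 (taps are an illustrative choice)."""
    state = seed
    out = []
    for _ in range(length):
        out.append(state)
        bit = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
        state = (state >> 1) | (bit << (width - 1))
    return out

def to_bitstream(x, rng_values, width=16):
    """Unipolar encoding: emit 1 whenever the LFSR sample falls below
    the threshold x * (2^width - 1), so P(bit = 1) ~= x."""
    threshold = int(x * ((1 << width) - 1))
    return [1 if r < threshold else 0 for r in rng_values]

# Multiply 0.5 * 0.5 stochastically: AND the two streams, then count 1s.
# Different seeds (hypothetical values) decorrelate the streams; a poor
# seed pair correlates them and inflates the multiplication error.
n = 4096
a = to_bitstream(0.5, lfsr_sequence(seed=0xACE1, length=n))
b = to_bitstream(0.5, lfsr_sequence(seed=0x1D2B, length=n))
product = sum(x & y for x, y in zip(a, b)) / n  # close to 0.25, with stochastic error
```

Because the result is only a probability estimate, its error shrinks with stream length; this accuracy/latency trade-off is what makes stochastic implementations of neural-network arithmetic sensitive to quantization and seed choice.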