Localization of moving military vehicles plays a vital role in border security and in safeguarding high-security facilities. Commonly applied range-based localization techniques, such as time of arrival, time difference of arrival, angle of arrival, and received signal strength, rely on known transmitters. However, when seismic sensor networks are used to localize moving targets, the targets themselves must be treated as unknown transmitters. In this work, we consider a scenario in which only receivers are deployed to sense the seismic signals generated by moving military vehicles at unknown locations. Consequently, conventional closed-form equations for distance-based trilateration are not applicable. To address this challenge, we present a novel approach for accurate localization. Our method clusters closely deployed sensor nodes and fuses their information to estimate the positions of the moving military vehicles. We use multiple-input convolutional neural networks, in which one input represents the short-time discrete Fourier transform of the signal from each node and another input encodes the relative locations of the sensors within a cluster. Through extensive experimentation, we demonstrate that the proposed method significantly reduces localization errors compared with existing distributed regression methods.
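
As an illustration of the two network inputs described above, the following is a minimal sketch of how they might be prepared for one sensor cluster. It is not the implementation used in this work. The function names, the Hann window, the frame length and hop size, and the choice to express sensor positions relative to the cluster centroid are all assumptions made for the example.

```python
import cmath
import math


def stft_magnitude(signal, frame_len=8, hop=4):
    """Magnitude of a short-time DFT using a Hann window.

    frame_len and hop are illustrative values, not taken from the paper.
    A naive DFT is used per frame to keep the sketch dependency-free.
    """
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        seg = [signal[start + n] * window[n] for n in range(frame_len)]
        # Keep only the non-negative frequency bins 0 .. frame_len // 2.
        spectrum = [
            abs(sum(seg[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                    for n in range(frame_len)))
            for k in range(frame_len // 2 + 1)
        ]
        frames.append(spectrum)
    return frames  # num_frames rows, frame_len // 2 + 1 columns


def relative_positions(sensor_xy):
    """Sensor coordinates relative to the cluster centroid (an assumed encoding)."""
    cx = sum(x for x, _ in sensor_xy) / len(sensor_xy)
    cy = sum(y for _, y in sensor_xy) / len(sensor_xy)
    return [(x - cx, y - cy) for x, y in sensor_xy]


def build_cluster_inputs(signals, sensor_xy):
    """Return (per-node spectrograms, relative sensor positions) for one cluster."""
    spectrograms = [stft_magnitude(s) for s in signals]
    return spectrograms, relative_positions(sensor_xy)
```

In this sketch, the list of per-node spectrograms would feed the convolutional input of the network and the relative positions would feed the second input; how the two branches are combined is left open here.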