Radio interferometers composed of large arrays of small antennas combine a large field of view with high sensitivity. For example, the Karoo Array Telescope (MeerKAT) achieves a gain of up to 2.8 K/Jy across its $>1\ \mathrm{deg}^2$ field of view. This capability significantly enhances the survey speed for pulsars and fast transients. It also introduces challenges: the data rate reaches a few Tb/s for MeerKAT, and processing it requires substantial computing power. To handle the high data rate of such surveys, we have developed a high-performance single-pulse search software called "TransientX". This software integrates multiple processes into one pipeline: radio-frequency interference mitigation, dedispersion, matched filtering, clustering, and candidate plotting. In TransientX, we developed an efficient CPU-based dedispersion implementation using the sub-band dedispersion algorithm. Additionally, TransientX employs the density-based spatial clustering of applications with noise (DBSCAN) algorithm to eliminate duplicate candidates, using an efficient implementation based on the kd-tree data structure. We also calculate the decrease in signal-to-noise ratio resulting from mismatches in dispersion measure, boxcar width, spectral index, and pulse shape. Remarkably, we find that the decrease in signal-to-noise ratio resulting from the mismatch between a boxcar-shaped template and a Gaussian-shaped pulse with scattering remains relatively small, at approximately 9%, even when the scattering timescale is ten times the pulse width. In contrast, the decrease in signal-to-noise ratio resulting from a spectral index mismatch becomes significant with multi-octave receivers. We have benchmarked the individual processes, including dedispersion, matched filtering, and clustering.
Our dedispersion implementation can be executed in real time using a single CPU core on data with 4096 channels and a time resolution of 153 microseconds, searching 4096 dispersion measure trials. Overall, TransientX offers the capability for efficient CPU-only real-time single-pulse searching.
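
The pulse-shape mismatch discussed above can be illustrated with a minimal sketch, which is not the TransientX implementation. It assumes unit-variance white noise, a pure Gaussian pulse without scattering (so it does not reproduce the 9% figure, which includes scattering), and hypothetical helper names (`gaussian_pulse`, `best_boxcar_snr`, `matched_snr`). It compares the signal-to-noise ratio recovered by the best boxcar template with that of an ideal filter matched to the true pulse shape.

```python
import math


def gaussian_pulse(n, centre, sigma):
    """Noise-free Gaussian pulse sampled on n bins (unit peak amplitude)."""
    return [math.exp(-0.5 * ((i - centre) / sigma) ** 2) for i in range(n)]


def best_boxcar_snr(signal, widths):
    """Return (snr, width) of the boxcar that maximises S/N.

    For unit-variance white noise, summing w samples gives a noise
    standard deviation of sqrt(w), so S/N = (sum in window) / sqrt(w).
    """
    # Prefix sums let every window sum be computed in O(1).
    prefix = [0.0]
    for s in signal:
        prefix.append(prefix[-1] + s)

    best_snr, best_width = 0.0, 0
    for w in widths:
        peak = max(prefix[i + w] - prefix[i]
                   for i in range(len(signal) - w + 1))
        snr = peak / math.sqrt(w)
        if snr > best_snr:
            best_snr, best_width = snr, w
    return best_snr, best_width


def matched_snr(signal):
    """S/N of the ideal matched filter (template equal to the pulse)."""
    return math.sqrt(sum(s * s for s in signal))


# Assumed illustration values: 1024 samples, pulse sigma of 20 samples.
pulse = gaussian_pulse(1024, 512, 20.0)
snr_box, width = best_boxcar_snr(pulse, range(1, 257))
efficiency = snr_box / matched_snr(pulse)
print(f"best boxcar width: {width} samples, efficiency: {efficiency:.3f}")
```

The efficiency printed here is the fraction of the ideal signal-to-noise ratio that the boxcar template retains; by the Cauchy-Schwarz inequality it can never exceed 1. Extending the sketch to scattered pulses would require convolving the Gaussian with an exponential tail, which is left out here.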