“…As shown in Table I, the network architectures used for sound source localization include fully connected (FC) neural networks [14], [17]-[19], [28]-[30], CNNs [20], [21], [23], recurrent neural networks (RNNs) [27], and CRNNs [22], [25], [26]. As for the network input, the spatial features used for training include inter-channel difference features such as IPD [14], [28]-[30], IID [14], [19], [28], eigenvectors of the spatial correlation matrix [18], the intensity vector [22], GCC [17], [19], [27], and spatial spectra such as the spatial pseudospectrum [23]. The magnitude spectrum [21], [27], [30] can also be fed to the network together with the spatial features, but it cannot be used alone.…”
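
To make the inter-channel features named above concrete, the following is a minimal sketch of how IPD, IID, and GCC with PHAT weighting might be computed for a two-microphone signal. It is an illustration under stated assumptions and not the feature extraction of any cited work: the function names `stft` and `spatial_features`, the frame length `n_fft`, the hop size `hop`, and the regularizer `eps` are all hypothetical choices, and the sign conventions (channel 1 relative to channel 2, positive lag when channel 2 lags) are assumed here.

```python
import numpy as np


def stft(x, n_fft=512, hop=256):
    """Minimal Hann-windowed STFT; returns a (frames, n_fft // 2 + 1) complex array."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack(
        [x[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    return np.fft.rfft(frames, axis=1)


def spatial_features(x1, x2, n_fft=512, hop=256, eps=1e-8):
    """Return (ipd, iid, gcc) for two equal-length single-channel signals.

    ipd: phase of channel 1 minus phase of channel 2, in radians, per frame and bin.
    iid: magnitude ratio of channel 1 to channel 2, in dB, per frame and bin.
    gcc: GCC-PHAT per frame, shifted so that zero lag sits at index n_fft // 2;
         a positive lag means channel 2 lags channel 1 (assumed convention).
    """
    X1, X2 = stft(x1, n_fft, hop), stft(x2, n_fft, hop)
    cross = X1 * np.conj(X2)  # cross-power spectrum

    ipd = np.angle(cross)
    iid = 20.0 * np.log10((np.abs(X1) + eps) / (np.abs(X2) + eps))

    # PHAT weighting keeps only the phase of the cross-power spectrum.
    phat = np.conj(cross) / (np.abs(cross) + eps)
    gcc = np.fft.irfft(phat, n=n_fft, axis=1)
    gcc = np.fft.fftshift(gcc, axes=1)
    return ipd, iid, gcc


# Usage: white noise on channel 1, a delayed and attenuated copy on channel 2.
rng = np.random.default_rng(0)
x1 = rng.standard_normal(16000)
delay = 5  # samples (hypothetical test value)
x2 = 0.5 * np.concatenate([np.zeros(delay), x1[:-delay]])

ipd, iid, gcc = spatial_features(x1, x2)
estimated_lag = int(np.argmax(gcc.mean(axis=0))) - 512 // 2
```

In this sketch the per-bin IPD and IID maps, or the GCC vector of each frame, would play the role of the spatial input features; the magnitude spectrum `np.abs(X1)` could be stacked alongside them, consistent with the remark that it is used only together with spatial features.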