We propose a model for speech recognition that consists of multiple semi-synchronized recognizers operating on a polyphase decomposition of standard speech features. Specifically, we treat multiple out-of-phase downsampled versions of the speech features as separate streams, which are modeled separately at the lowest level and then integrated at a higher level (words) during first-pass decoding. Our model lessens the severity of the oversampling problem found in many speech recognition systems: speech modulation energy is most important below 25 Hz, yet a 100 Hz frame rate gives a modulation bandwidth of 50 Hz. Our polyphase approach moreover captures wider and more diverse dynamics within the speech signal. Our integrative network is high-level: it couples and decodes word strings from the different recognizers simultaneously and asynchronously. We provide preliminary results on the 10-word vocabulary version of the SVitchboard (small-vocabulary Switchboard) task and show that our polyphase recognition system significantly outperforms an optimized HMM baseline.
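
To make the polyphase decomposition concrete, the following minimal sketch splits a frame-level feature matrix into out-of-phase downsampled streams and computes the resulting modulation bandwidth. It is an illustration under stated assumptions, not the system described here: the function names `polyphase_streams` and `modulation_bandwidth_hz` are hypothetical, the number of phases is a free parameter (the value 2 in the usage note is chosen only because it maps a 100 Hz frame rate to a 25 Hz modulation bandwidth), and the frames are decimated directly, with no low-pass filtering beforehand.

```python
import numpy as np


def polyphase_streams(features, num_phases):
    """Split a (num_frames, dim) feature matrix into out-of-phase streams.

    Stream k holds frames k, k + num_phases, k + 2 * num_phases, ...
    Assumption: plain decimation with no anti-aliasing filter.
    """
    features = np.asarray(features)
    if num_phases < 1:
        raise ValueError("num_phases must be a positive integer")
    return [features[k::num_phases] for k in range(num_phases)]


def modulation_bandwidth_hz(frame_rate_hz, num_phases=1):
    """Nyquist bandwidth of one stream: half of its per-stream frame rate."""
    return frame_rate_hz / num_phases / 2.0
```

For example, with hypothetical 100 Hz features and `num_phases=2`, each stream runs at 50 frames per second, so `modulation_bandwidth_hz(100.0, 2)` gives 25.0 Hz, while the undecimated features give 50.0 Hz. The two streams together still contain every original frame, so no data is discarded; each recognizer simply sees a different phase of the sequence.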