Deep networks have recently been proposed to estimate motor intention from conventional bipolar surface electromyography (sEMG) signals for myoelectric control of neurorobots. In this setting, deep networks are generally challenged by long training times (affecting practicality and calibration), complex model architectures (affecting the predictability of the outcomes), a large number of trainable parameters (increasing the need for big data), and a risk of overfitting. Capitalizing on our recent work on homogeneous temporal dilation in a Recurrent Neural Network (RNN) model, this paper proposes, for the first time, heterogeneous temporal dilation in a long short-term memory (LSTM) model and applies it to high-density surface electromyography (HD-sEMG), allowing dynamic temporal dependencies to be decoded with tunable temporal foci. A 128-channel HD-sEMG signal space is considered because of its potential for enhancing the spatiotemporal resolution of human-robot interfaces. Accordingly, this paper addresses a challenging motor intention decoding problem for neurorobots, namely, transient intention identification. This problem considers only the dynamic, transient phase of gesture movements, when the signals have not yet stabilized or plateaued. Addressing it can significantly enhance the temporal resolution of human-robot interfaces, which would in turn support seamless real-time implementation. Additionally, this paper introduces the concept of “dilation foci” to modulate the modeling of temporal variation in transient phases. A large number of gestures (65) is included in this work, which adds to the complexity and significance of the understudied problem. Our results show state-of-the-art performance for gesture prediction in terms of accuracy, training time, and model convergence.