Implicit staggered‐grid finite‐difference methods are attractive for elastic wave modelling due to significantly enhanced spatial accuracy compared to explicit ones. However, the central‐grid finite‐difference operators used to approximate the temporal derivatives result in a limited accuracy in time. Temporal high‐order finite‐difference methods have the ability to weaken the temporal dispersion and improve the modelling stability. It is noted that the previous temporal high‐order and spatial implicit finite‐difference methods are all designed in the space domain for performing acoustic wave propagation. To implement 2‐D elastic wave simulation with high‐order accuracy both in space and time, we propose two time–space domain implicit staggered‐grid finite‐difference schemes, in which the spatial derivatives are approximated by the weighted average of a few extra off‐axial nodes and axial nodes of the conventional cross‐stencil. We derive the P‐ and S‐wave dispersion relations of the whole elastic wave equation and estimate the finite‐difference coefficients via a variable substitution‐based Taylor‐series expansion. Our Taylor‐series expansion‐based new scheme yields high‐order temporal and spatial accuracy. Besides, the spatial accuracy can be further enhanced by our newly proposed linear optimization strategy, which benefits from easy implementation since we only optimize the axial spatial coefficients via a least‐squares strategy and set the off‐axial temporal coefficients the same as the solution of the Taylor‐series expansion method. Besides, the P‐ and S‐wave separation approach is adopted to propagate the P‐ and S‐wavefields with the P‐ and S‐wave dispersion relation‐based finite‐difference operators, respectively. Our two new schemes are more capable of suppressing the numerical dispersion and exhibit better stability performance compared to conventional one, as we will illustrate via a detailed analysis of dispersion, stability and numerical experiments. In addition, a comparison of computation times demonstrates the efficiency advantage of two new schemes since small operator lengths and large time steps are allowed.