Modern datacenters host a wide variety of application services, which generate a mix of delay-sensitive short flows and throughput-oriented long flows that share the multi-path datacenter network. Although existing load balancing designs make full use of the available parallel paths and attain high bisection bandwidth, they reroute flows without regard to their differing performance requirements. As a result, short flows suffer from large queuing delay and packet reordering, while long flows fail to obtain high throughput because of low link utilization and packet reordering. To address these inefficiencies, we design a fine-grained load balancing scheme called TR (Traffic-aware Rerouting), which identifies flow types and performs flexible, traffic-aware rerouting to balance the performance of both short and long flows. In addition, to avoid packet reordering, TR leverages reverse ACKs to estimate the switch-to-switch delay and excludes paths that could cause packet reordering. Moreover, TR is deployed only on switches and requires no modification to end hosts. Large-scale NS2 simulations show that TR reduces the average and tail flow completion time of short flows by up to 60% and 80%, respectively, and provides up to a 3.02x gain in long-flow throughput compared to state-of-the-art load balancing schemes.
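
The idea of excluding paths that could cause reordering can be illustrated with a minimal sketch. It is not TR's actual algorithm, whose rules are not given in this abstract. The sketch assumes that the switch already holds a per-path switch-to-switch delay estimate (for example, derived from reverse ACKs) and that a candidate path is safe when a packet sent on it cannot arrive before the flow's previous packet, which left some idle gap earlier on the current path. The names `reorder_safe_paths`, `choose_path`, `delay_us`, and `idle_gap_us` are hypothetical.

```python
from typing import Dict, List


def reorder_safe_paths(current: str,
                       delay_us: Dict[str, float],
                       idle_gap_us: float) -> List[str]:
    """Return the paths on which the next packet cannot overtake the last one.

    Assumption (not from the paper): the flow's previous packet was sent
    idle_gap_us ago on `current`, so it still needs about
    delay_us[current] - idle_gap_us to reach the destination switch.
    """
    remaining_us = delay_us[current] - idle_gap_us
    # A path is kept only if its estimated delay is at least the time the
    # previous packet still needs; otherwise the new packet could arrive first.
    return [path for path, delay in delay_us.items() if delay >= remaining_us]


def choose_path(current: str,
                delay_us: Dict[str, float],
                idle_gap_us: float) -> str:
    """Pick the lowest-delay path among those that are reordering-safe."""
    safe = reorder_safe_paths(current, delay_us, idle_gap_us)
    # The current path always satisfies the condition for a non-negative gap,
    # so `safe` is never empty.
    return min(safe, key=lambda path: delay_us[path])


if __name__ == "__main__":
    # Hypothetical delay estimates in microseconds for three parallel paths.
    delays = {"p0": 100.0, "p1": 40.0, "p2": 80.0}
    print(choose_path("p0", delays, idle_gap_us=30.0))  # p2: p1 is excluded
    print(choose_path("p0", delays, idle_gap_us=70.0))  # p1: gap is large enough
```

With a 30 µs gap, the fastest path `p1` is rejected because a packet sent on it could overtake the previous one. Once the gap grows to 70 µs, the same path becomes safe and is selected.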