The Transmission Control Protocol (TCP) plays a critical role in the Internet, as it is the transport protocol used by most Internet services and applications. With rapid advances in broadband Internet and mobile/wireless networks, current TCP designs are increasingly becoming the bottleneck. This work tackles this challenge by developing a novel TCP design called Fast Launch with Agile congeStion Handling (FLASH), which not only improves performance for long TCP flows but also significantly raises the performance of short to medium TCP flows, which are far more common in the Internet. We evaluated its long-term and short-term performance over a wide range of network environments, using two emulation platforms (Pantheon and DummyNet) as well as Internet experiments. Compared to two of the leading TCP designs deployed in the Internet, namely Cubic and BBR, FLASH consistently achieved higher long-term and short-term bandwidth efficiency. For example, in trace-driven emulated experiments using Poisson traffic with a mean flow size of 1 MB at a medium link utilization of 27%, FLASH reduced the flow completion time (FCT) by 36% (vs. Cubic) and 26% (vs. BBR), with a mean packet queueing delay of 11.7 ms compared to 3.4 ms (Cubic) and 8.8 ms (BBR). It also maintained good fairness with itself and was competitive against Cubic and BBR when sharing the same bottleneck. In addition, FLASH was tested in two real-world Internet environments. In the cloud-to-cloud experiment, it reduced FCT by 52.9% (vs. Cubic) and 46.6% (vs. BBR), while in the cloud-to-client experiment, it reduced FCT by 31.3% (vs. Cubic) and 12.7% (vs. BBR). FLASH is entirely sender-based and compatible with current TCP receivers, and is therefore readily deployable in current Internet servers.