The heterogeneous nature of combined graphics processing unit (GPU) and CPU architectures makes them a candidate for upcoming exascale systems. The cores of a general-purpose GPU (GPGPU), a cost-effective computing platform, are characterized by long idle periods, which results in underutilization of hardware resources. This is due to several factors, such as limited on-chip memory and register files, inefficient scheduling mechanisms, and GPU-CPU communication bottlenecks. To counteract this underutilization of resources, a number of techniques have been proposed. In this work, many architectural and system-level techniques that aim to manage and fully leverage GPU resources are surveyed, compared, and evaluated. The significance and challenges of warp schedulers in GPUs are also thoroughly discussed. The main purpose of this paper is to give researchers insight into warp scheduling techniques for GPUs and to motivate them to propose more efficient methods for enhancing performance by improving thread scheduling in future GPUs.