This paper presents an efficient cooperative interaction between multicore (CPU) and manycore (GPU) resources in the design of a high-performance video encoder. Applied to the well-established and highly optimized VP8 encoding format, the proposed technique achieves a significant speed-up (up to 6×) with respect to the most optimized software encoder, with minimal degradation of visual quality and low processing latency. This result is obtained through a highly optimized CPU-GPU interaction, the exploitation of specific GPU features, and a modified search algorithm specifically adapted to the GPU execution model. Several experimental results are reported and discussed, confirming the effectiveness of the proposed technique. Although implemented for the VP8 standard, the presented approach is of general interest, as it could be applied to any other video encoding scheme.