Deep learning-based object detectors now surpass conventional detection methods in accuracy. Although their deployment is still limited by hardware capabilities, this limitation can be overcome by combining edge devices with cloud computing. Recent cloud-based object detector architectures are generally built on representational state transfer (RESTful) web services, which exchange data through a polling mechanism. As a result, these systems suffer from low detection speed and cannot support real-time data streaming. This study therefore aims to increase the detection speed of cloud-based object recognition systems by using gRPC and Protobuf to support real-time detection. The proposed architecture was deployed on a Virtual Machine Instance (VMI) equipped with a Graphics Processing Unit (GPU). A gRPC server and the YOLOv3 deep learning object detector were executed on the cloud server to handle detection requests from edge devices. The images captured by the edge devices were encoded in Protobuf format to reduce the size of the messages delivered to the cloud server. The results showed that the proposed architecture improved client-side detection speed by 0.27 FPS to 1.72 FPS compared with the state-of-the-art method. The architecture could also serve multiple connected edge devices with only a slight performance degradation of 1.78 FPS to 1.83 FPS, depending on the network interface used.
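The message-size argument can be illustrated with a minimal sketch. It is not the implementation used in this study: it hand-encodes a single Protobuf length-delimited field following the public wire format (tag, varint length, raw bytes) instead of using generated Protobuf classes, and it compares the result against a hypothetical REST-style body that carries the same image as base64 text inside JSON. The field number `1`, the JSON key `image`, the 30,000-byte frame size, and the base64-in-JSON baseline are all assumptions made for illustration.

```python
import base64
import json
import os


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a Protobuf base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_bytes_field(field_number: int, payload: bytes) -> bytes:
    """Encode one length-delimited (wire type 2) field: tag, length, payload."""
    tag = encode_varint((field_number << 3) | 2)
    return tag + encode_varint(len(payload)) + payload


def json_base64_message(payload: bytes) -> bytes:
    """Hypothetical REST-style body: the image as base64 text inside JSON."""
    body = {"image": base64.b64encode(payload).decode("ascii")}
    return json.dumps(body).encode("utf-8")


if __name__ == "__main__":
    # Random bytes stand in for one compressed camera frame (assumed size).
    frame = os.urandom(30_000)
    binary = encode_bytes_field(1, frame)  # field number 1 is hypothetical
    text = json_base64_message(frame)
    print(f"binary field: {len(binary)} bytes, base64 JSON: {len(text)} bytes")
```

Under these assumptions the binary field adds only a few bytes of framing to the raw image, whereas base64 expands the payload by roughly one third before any JSON overhead. The sketch says nothing about transport cost, which in the proposed architecture also depends on gRPC and the network interface.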