With the advent of robot-assisted surgery, user-friendly technologies have been applied to the da Vinci surgical system (dVSS), and their efficacy has been validated in surgical practice worldwide. However, the traditional manipulation method still needs improvement, because it cannot control the endoscope and the surgical instruments simultaneously. This study proposes a speech recognition control interface (SRCI) that lets the operator control the endoscope via speech commands while manipulating the surgical instruments, as a replacement for the traditional method. The usability of the proposed SRCI-based method and the traditional manipulation method was compared in accordance with ISO 9241-11. Twenty surgeons and 18 novices evaluated both manipulation methods through the line tracking task (LTT) and the sea spike pod task (SSPT). After the tasks, they completed three widely used questionnaires: the after-scenario questionnaire (ASQ), the system usability scale (SUS), and the NASA task load index (TLX). Completion times in the LTT and SSPT with the proposed method were 44.72% and 26.59% shorter, respectively, than with the traditional method, and these differences were statistically significant (p < 0.001). The overall ASQ, SUS, and NASA TLX results favored the proposed method, with particularly substantial reductions in workload dimensions such as physical demand and effort (p < 0.05). The proposed speech-mediated method is therefore a suitable candidate for the simultaneous manipulation of an endoscope and surgical instruments in dVSS-based robotic surgery. It can replace the traditional method when the endoscope must be controlled while the surgical instruments are being manipulated, which helps maintain a continuous surgical flow during operations.