Abstract-Many embedded systems process huge amount of data that comes from the interaction with the environment. The Graphics Processing Unit (GPU) is a modern embedded solution that tackles the efficiency challenge when processing a lot of data. GPU may improve even more the system performance by allowing multiple activities to be executed in a parallel manner. In a complex component-based application, the challenge is to decide the components to be executed in parallel on GPU when considering different system factors (e.g., GPU memory, GPU computation power).In the context of component-based CPU-GPU embedded systems, we propose an automatic method that provides parallel execution schemes of components with GPU capabilities. The introduced method considers hardware (e.g., available GPU memory), software properties (e.g., required GPU memory) and communication pattern. Moreover, the method optimizes the overall system performance based on component execution times and system architecture (i.e., communication pattern). The validation uses an underwater robot example to describe the feasibility of our proposed method.