We propose automatic synthesis of application specific instruction set processors (ASIPs). We use pipeline execution of multi-op machine-instructions, e.g., * (reg1 * reg2) = ( * reg3)+ ( * reg4) (C-syntax) an instruction with three memory pipeline stages and two arithmetic stages. The problem is, for a given set of loops, to find a pipeline configuration and a multi-op ISA that maximizes the IPC (instructions per cycle) while minimizing the resource usage and the cost of interconnections to the registerfile of the resulting CPU. The algorithm is based on finding an efficient cover of a large graph by a small set of convex subgraphs gis that are consistent with a given set of pipeline units. Unlike previous works, gis are not synthesized to circuits that are executed in a co-processor mode but rather both gis and the rest of the program are executed by the same set of multiop pipeline units. In this way we eliminate the overhead associated with the co-processor mode of regular ASIPs but maintain high values of IPC of these ASIPs. Once the pipeline configuration and the cover g1 ∪ . . . ∪ gn = G has been computed the Verilog RTL of the corresponding CPU (extended with branch instructions) is generated and synthesized to FPGA. The results show that, for a set of selected kernels, the resulting ASIP (called Ocpu) obtains higher IPC values compare to an equivalent compilation to an ARM cpu while obtaining similar clock frequencies.