In this paper, we show the effectiveness of pipeline implementations of Dynamic Programming (DP) on a Graphics Processing Unit (GPU). We deal with a simplified DP problem in which each element of the solution table is calculated in order by applying semi-group operations to several previously computed elements of the table. We implement the DP program on a GPU in a pipeline fashion, i.e., we assign GPU cores to pipeline stages so that several elements of the solution table are partially computed at the same time. Further, to accelerate the pipeline implementation, we propose a p-fold pipeline technique, which enables a degree of parallelism greater than the number of pipeline stages.
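To make the simplified DP problem and the pipeline idea concrete, the following is a minimal CPU-side sketch in Python, not the GPU implementation itself. It assumes a recurrence of the form ST[i] = ST[i - a_1] ⊗ ST[i - a_2] ⊗ ... ⊗ ST[i - a_k] with distinct positive offsets a_1 > a_2 > ... > a_k; this form, the names `dp_sequential` and `dp_pipelined`, and the parameters `init`, `offsets`, and `op` are illustrative assumptions rather than definitions taken from the paper. The inner loop over stages stands in for GPU threads that would run concurrently within one step. The p-fold technique is not modeled here.

```python
import operator
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def dp_sequential(init: Sequence[T], offsets: Sequence[int],
                  op: Callable[[T, T], T], n: int) -> List[T]:
    """Reference version: fill the table one element at a time."""
    table = list(init)
    for i in range(len(init), n):
        acc = table[i - offsets[0]]
        for a in offsets[1:]:
            acc = op(acc, table[i - a])
        table.append(acc)
    return table


def dp_pipelined(init: Sequence[T], offsets: Sequence[int],
                 op: Callable[[T, T], T], n: int) -> List[T]:
    """Simulated pipeline: at step t, stage j folds one operand into element t - j.

    Assumes offsets are distinct, positive, sorted in descending order, and
    that len(init) >= offsets[0]. Under these assumptions every operand a
    stage reads was completed in an earlier step.
    """
    k = len(offsets)
    m = len(init)
    table: List = list(init) + [None] * max(0, n - m)
    for t in range(m, n + k - 1):
        # Stages within one step are independent, so a GPU could run them
        # in parallel; here they are evaluated from a consistent snapshot.
        updates = []
        for j in range(k):
            i = t - j
            if m <= i < n:
                operand = table[i - offsets[j]]
                value = operand if j == 0 else op(table[i], operand)
                updates.append((i, value))
        for i, value in updates:
            table[i] = value
    return table


if __name__ == "__main__":
    # Integer addition is a semi-group operation; offsets (3, 2, 1)
    # give a tribonacci-style table.
    print(dp_pipelined([0, 1, 1], (3, 2, 1), operator.add, 8))
```

In this sketch, element i enters stage 0 at step i and leaves stage k - 1 at step i + k - 1, so up to k elements are in flight at once. Because both functions apply the operands in the same order, they agree even when the operation is not commutative.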