Short time-to-market cycles and the need to reduce design and verification costs are driving the move toward programmable implementations of video processing algorithms. We present two processor architectures: the first is an application-specific instruction-set processor (ASIP) design, whereas the second is a domain-specific instruction-set processor (DSIP) design with a more general-purpose instruction set. In this work, we present results for the H.264 and VP8 in-loop deblocking algorithms. The processors are based on the transport triggered architecture, which provides scalable instruction-level parallelism and, thanks to its simple structure, lends itself to cost-effective designs. Both designs are programmed in C with minimal additional parallelism markup. The designs fulfill the real-time requirements for filtering macroblocks in high-definition video. The first architecture, based on special function units, filters a high-definition stream (1920 × 1080) at 75 fps, whereas the second architecture, which offers better programmability, filters the stream at 53 fps. The processors run at a clock frequency of 200 MHz, and their areas range from 146k to 373k gate equivalents depending on the processor architecture.
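To put the reported frame rates in perspective, the following minimal sketch converts a frame rate and clock frequency into an average cycle budget per macroblock. It is an illustration only and not part of the presented designs: the function names are hypothetical, and the calculation assumes 16 × 16 macroblocks, frame dimensions rounded up to whole macroblocks, and a processor that spends every clock cycle on deblocking.

```c
#include <stdint.h>

/* Number of 16x16 macroblocks in a frame. Partial blocks at the frame
 * edges are counted as whole ones (assumption: 1080 rows are coded as
 * 68 macroblock rows, as is usual for H.264). */
uint32_t macroblocks_per_frame(uint32_t width, uint32_t height)
{
    uint32_t mb_cols = (width + 15u) / 16u;   /* ceil(width / 16)  */
    uint32_t mb_rows = (height + 15u) / 16u;  /* ceil(height / 16) */
    return mb_cols * mb_rows;
}

/* Average clock cycles available per macroblock at a given frame rate,
 * assuming the whole clock budget is spent on filtering. */
double cycles_per_macroblock(uint32_t width, uint32_t height,
                             double fps, double clock_hz)
{
    double mb_per_second = (double)macroblocks_per_frame(width, height) * fps;
    return clock_hz / mb_per_second;
}
```

Under these assumptions, a 1920 × 1080 frame holds 8160 macroblocks, so `cycles_per_macroblock(1920, 1080, 75.0, 200e6)` gives roughly 327 cycles per macroblock for the first architecture, and the same call with 53 fps gives roughly 462 cycles for the second.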