In this paper, a pseudo object-oriented video coding system is proposed and implemented. To increase coding efficiency, a cache vector quantization (VQ) algorithm is introduced to further compress those areas where motion estimation fails. According to our preliminary simulation results, the visual quality of long sequences remains acceptable even at bit-rates below 10 Kbps. In addition to achieving a high compression ratio at very low bit-rates, the proposed system is expected to support content-based applications, since it uses a segmented motion field; furthermore, prediction errors generally occur in emotionally important regions, e.g., the eyes and mouth. The coded components are therefore not only useful for compression but also meaningful for video recognition.