Memory organization for video algorithms on programmable signal processors

Greef, Eddy De; Catthoor, Francky; Man, H. De

doi:10.1109/iccd.1995.528922

Cited by 10 publications

(3 citation statements)

References 24 publications

“…So we can analyze the data flow to help cache find out the best mapping position, or manage cache by ourselves. Compile-time data caching decisions have a large effect on the performance [8].…”

Section: Memory Organization (mentioning)

confidence: 99%

Performance-Complexity Analysis of High Resolution Video Encoder and its Memory Organization for DSP Implementation

Yang

Gao

Liu

2006

2006 IEEE International Conference on Multimedia and Expo

This paper first analyses the relationship between performance and complexity of several state-of-the-art coding algorithms for high resolution videos. Based on the coding efficiency comparison under different config parameters and the intra mode usage in P/B frame, this paper presents a practical scheme to improve the coding speed with slight quality loss. And a DSP-oriented two-level internal memory organization is also proposed to keep pipeline processing. In such organization, block correlation caused by motion vector predictions is lightened while keeping almost the same performance as the original.

“…Allocation of data to memory can, in principle, be combined with hardware-software partitioning and process scheduling [35]. Memory optimization, however, drastically increases the number of design parameters beyond the assignment of data variables to memories and accesses to memory ports. Multidimensional data arrays can be rearranged in memory by index transformations to improve memory use or simplify array index generation.…”

Section: Design Space Exploration (mentioning)

confidence: 99%

Codesign of embedded systems: status and trends

Ernst

1998

IEEE Des. Test. Comput.

146
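
The citation statement above mentions rearranging multidimensional arrays in memory through index transformations. As a rough illustration only, the sketch below compares a conventional row-major address mapping with a tiled (blocked) mapping for a 2-D frame and counts how many memory lines one 8x8 block touches under each. The function names, the 64-word frame width, the 8x8 tile, and the 16-word line size are all assumptions chosen for the example; none of them come from the cited papers.

```python
def row_major(r, c, width):
    """Conventional linear address of element (r, c)."""
    return r * width + c


def tiled(r, c, width, tile):
    """Address after an index transformation that stores every
    tile x tile block contiguously (assumes width % tile == 0)."""
    tiles_per_row = width // tile
    tile_index = (r // tile) * tiles_per_row + (c // tile)
    offset = (r % tile) * tile + (c % tile)
    return tile_index * tile * tile + offset


def lines_touched(addr_fn, r0, c0, block, line_words):
    """Number of distinct memory lines covered by a block x block
    window whose top-left corner is (r0, c0)."""
    return len({addr_fn(r, c) // line_words
                for r in range(r0, r0 + block)
                for c in range(c0, c0 + block)})


# Hypothetical parameters, chosen only for illustration.
WIDTH, TILE, LINE_WORDS = 64, 8, 16

rm_lines = lines_touched(lambda r, c: row_major(r, c, WIDTH),
                         8, 8, TILE, LINE_WORDS)
tl_lines = lines_touched(lambda r, c: tiled(r, c, WIDTH, TILE),
                         8, 8, TILE, LINE_WORDS)
print(rm_lines, tl_lines)  # 8 4
```

With these assumed parameters the tile-aligned block spans eight lines in row-major order but only four after the transformation, because the 64 words of the block become contiguous. Whether such a layout pays off in practice depends on the access pattern and on the cost of generating the transformed indices.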

“…Li et al [1995] present a technique for estimation of instruction cache performance. De Greef et al [1995] have studied the effect of cache parameters and memory organization strategies on video and imaging applications. Rawat [1993] and Austin [1996] have addressed the problem of variable placement for improving cache performance.…”

Section: Related Work (mentioning)

confidence: 99%

Memory data organization for improved cache performance in embedded processor applications

Panda

Dutt

Nicolau

1997

ACM Trans. Des. Autom. Electron. Syst.

Code generation for embedded processors opens up the possibility for several performance optimization techniques that have been ignored by traditional compilers due to compilation time constraints. We present techniques that take into account the parameters of the data caches for organizing scalar and array variables declared in embedded code into memory, with the objective of improving data cache performance. We present techniques for clustering variables to minimize compulsory cache misses, and for solving the memory assignment problem to minimize conflict cache misses. Our experiments with benchmark code kernels from DSP and other domains on the CW4001 embedded processor from LSI Logic indicate significant improvements in data cache performance by the application of our memory organization technique.

[...] the performance improvement of applications running on general-purpose embedded processors. In a general-purpose embedded processor, the architecture more closely resembles traditional processors, with the following well-known exceptions: (1) we now frequently have only a single application running on the processor, and (2) we are permitted longer analysis and compilation times for the application. These features raise many interesting problems that are unique to the embedded processor environment, and that have not been addressed by traditional compilers (or have been addressed only partially), largely due to restrictions on compilation times permitted.

Generation of efficient code for embedded processors has been the subject of recent investigation [Goosens et al. 1990; Paulin et al. 1995; Araujo et al. 1995]. Optimization techniques that improve the performance of application programs by exploiting the irregular architectures of some embedded DSP processors and other application-specific processors have been reported [Sudarsanam and Malik 1995; Goosens et al. 1990; Liem et al. 1994]. Research efforts have also focused on retargetable code generation, with an attempt to generate code from the same behavioral specification, into different target embedded processors, using a suitable processor model [Lanneer et al. 1995; Schenk 1995].

An important determinant of performance in embedded systems is the interaction between the processor and external memory. Embedded processors such as the CW4001 are equipped with on-chip instruction and data caches, which interface with larger off-chip memories. Since off-chip memory accesses usually stall the CPU execution for significant durations (each access could take 10-20 processor cycles, depending on the relative processor and memory access speeds), it is important to design the interface between cache and main memory carefully [Patterson and Hennessy 1994]. Several architectural and compiler optimizations have been reported in the past that ensure spatial and temporal locality of programs so as to improve instruction and data caches.

Cache misses can be classified into several categories: (1) compulsory misses: caused when a memory word is accessed for the first time; (2) capacity mis...
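
The abstract above distinguishes compulsory misses from conflict misses and ties the latter to memory assignment. The following is a minimal sketch of that effect, not the authors' technique: a toy direct-mapped cache model in which two arrays are read in an interleaved loop, first with base addresses exactly one cache size apart and then with the second array shifted by one cache line. The cache geometry (16 lines of 16 bytes), the 4-byte element size, and all function names are assumptions made for the example and do not describe the CW4001.

```python
def count_misses(trace, num_lines, line_bytes):
    """Simulate a direct-mapped cache and return the number of
    misses for a sequence of byte addresses."""
    tags = [None] * num_lines
    misses = 0
    for addr in trace:
        block = addr // line_bytes   # memory block holding addr
        index = block % num_lines    # cache line it maps to
        if tags[index] != block:
            tags[index] = block      # fetch the block, evict old one
            misses += 1
    return misses


def interleaved_trace(base_a, base_b, n, elem_bytes):
    """Addresses touched by: for i in range(n): read a[i], b[i]."""
    trace = []
    for i in range(n):
        trace.append(base_a + i * elem_bytes)
        trace.append(base_b + i * elem_bytes)
    return trace


# Hypothetical cache and array parameters.
NUM_LINES, LINE_BYTES, ELEM_BYTES, N = 16, 16, 4, 64
CACHE_BYTES = NUM_LINES * LINE_BYTES

# b starts one cache size after a, so a[i] and b[i] always compete
# for the same cache line.
conflicting = count_misses(
    interleaved_trace(0, CACHE_BYTES, N, ELEM_BYTES),
    NUM_LINES, LINE_BYTES)

# b shifted by one extra line, so a[i] and b[i] use different lines.
padded = count_misses(
    interleaved_trace(0, CACHE_BYTES + LINE_BYTES, N, ELEM_BYTES),
    NUM_LINES, LINE_BYTES)

print(conflicting, padded)  # 128 32
```

In this toy setup every one of the 128 accesses misses in the first placement, while the shifted placement leaves only the 32 first-touch (compulsory) misses, one per distinct memory block.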
