SUMMARY This paper proposes a convolution scheme with a systolic array structure for perspective projection in real-time volume graphics based on the shear-warp method. In the original method, the further a ray proceeds, the more voxels are required to calculate the convolution. This growth in the number of required voxels makes the method difficult to implement in a VLSI-oriented architecture. We address the problem in two ways. 1) We use several sets of voxels at different resolutions, selected according to depth, so that the convolution can be computed with a constant number of voxels independent of depth. 2) We implement the 3D convolution as three serial 1D convolutions along the X, Y and Z axes, which reduces the number of calculation steps from $M^3$ to $3M$, where the convolution is calculated over an $M^3$ region. For a dataset of $V^3$ voxels, the number of pipelines for rays is $V^2$, and each pipeline has $3M$ stages. If the hardware of a single pipeline is capable of calculating $V$ rays, each implemented pipeline is assigned to $V$ theoretical pipelines (for $V^2$ rays). In an actual implementation, the number of hardware pipelines should be much smaller than the $V$ theoretical pipelines. We therefore fold the theoretical pipelines, reducing them to a fixed number of hardware pipelines, and we show the relation between the folding process and the time delay it requires. The architecture can generate images of a $256^3$-voxel dataset ($V = 256$) at 30 Hz with 4 pipelines. In addition, the architecture can easily be extended to $512^3$ ($V = 512$) and $1024^3$ ($V = 1024$) datasets with 32 and 256 pipelines, respectively.
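The step reduction in item 2) can be illustrated in software. The following is a minimal sketch, not the proposed hardware: it assumes the 3D kernel is separable (the outer product of one 1D kernel with itself), which the summary does not state explicitly, and it uses zero padding at the volume boundary. The function names and the example kernel are hypothetical. It compares three serial 1D passes (3M multiply-adds per voxel) against a direct 3D convolution (M^3 multiply-adds per voxel).

```python
import numpy as np


def convolve_1d_along_axis(volume, kernel, axis):
    """Convolve every line of `volume` along `axis` with a 1D kernel (zero-padded, same size)."""
    return np.apply_along_axis(
        lambda line: np.convolve(line, kernel, mode="same"), axis, volume
    )


def separable_convolve_3d(volume, kernel):
    """Three serial 1D convolutions along the three axes: 3*M multiply-adds per voxel."""
    out = np.asarray(volume, dtype=float)
    for axis in range(3):
        out = convolve_1d_along_axis(out, kernel, axis)
    return out


def direct_convolve_3d(volume, kernel):
    """Reference: direct 3D convolution with the outer-product kernel, M**3 multiply-adds per voxel."""
    m = len(kernel)  # assumed odd, and no larger than any volume dimension
    r = m // 2
    k3 = np.einsum("i,j,k->ijk", kernel, kernel, kernel)  # M x M x M kernel
    padded = np.pad(np.asarray(volume, dtype=float), r)   # zero padding on all sides
    nx, ny, nz = volume.shape
    out = np.zeros((nx, ny, nz), dtype=float)
    for i in range(m):
        for j in range(m):
            for k in range(m):
                # Convolution flips the kernel relative to the sliding window.
                weight = k3[m - 1 - i, m - 1 - j, m - 1 - k]
                out += weight * padded[i:i + nx, j:j + ny, k:k + nz]
    return out


def steps_per_voxel(m):
    """Multiply-add counts per output voxel: (direct, separable)."""
    return m ** 3, 3 * m


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    vol = rng.random((8, 8, 8))
    kern = np.array([0.25, 0.5, 0.25])  # hypothetical smoothing kernel, M = 3
    same = np.allclose(separable_convolve_3d(vol, kern), direct_convolve_3d(vol, kern))
    print("results match:", same)
    print("steps per voxel (direct, separable):", steps_per_voxel(len(kern)))
```

Because the three passes are applied one after another, they map naturally onto a pipeline with 3M stages, whereas the direct form would need M^3 operations per output voxel.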