This paper presents a parallel ray-tracing algorithm for rendering very large models (more than 100 million triangles) on distributed computer architectures. On a single computer, the size of such a dataset forces out-of-core computation. Cluster architectures built from off-the-shelf components offer extended capacity, which allows the large dataset to be kept in the aggregated main memories. To make operational applications scale, the real challenge is then to exploit the available memory and computing power efficiently. Ray tracing for high-quality image rendering spawns non-coherent rays, which generate irregular tasks that are difficult to distribute on such architectures. We present a cache mechanism that manages the main memory distributed across the computers of the cluster, and we implement a load-balancing solution based on a self-adaptive algorithm to distribute the computation efficiently.