Developing an efficient and highly scalable ray tracer for the Metropolis light transport algorithm is becoming increasingly important as the request for photorealistic images becomes a common trend. Although the Metropolis light trans port algorithm has produced some of the most realistic images to date, it usually takes a great amount of time to render an image. The development of an efficient and highly scalable ray tracer for the Metropolis light transport algorithm is hard due in large part to the irregular memory access patterns, the imbalanced workload of light-carrying paths and the complicated mathematical model and complex physical processes.In this paper, we present a highly scalable physically based parallel ray tracer for the Metropolis light transport algorithm.Firstly, we present the idea of snapshot and sub-snapshot, then propose a novel assignment partitioning algorithm for compute nodes and CPU cores since the demand-driven assignment partitioning algorithms don't work. Secondly, we propose a physically based parallel ray racing framework for the Metropolis light transport algorithm, which is based on a master-worker architecture. Finally, we discuss the issue of granularity of the assignment partitioning and some optimization strategies for improving overall performance, then a hybrid scheduling strategy combining a static and dynamic scheduling strategy is described.Experiments show that our physically based ray tracer almost reaches linear speedup by using 26,400 CPU cores on the Tianhe-2 supercomputer. Our ray tracer is more efficient and highly scalable.Keywords-physically based ray tracing, bidirectional path trac ing, Metropolis light transport algorithm, assignment partition ing, hybrid scheduling, distributed computing.