In this work, image formation in a confocal laser scanning microscope (CLSM) is investigated for custom-made multi-cylinder phantoms. The phantoms were fabricated using 3D direct laser writing and consist of parallel cylinders with a radius of either 5 μm or 10 μm, depending on the phantom, with overall dimensions of about 200 × 200 × 200 μm³. Measurements were performed for different refractive index differences and for varied parameters of the measurement system, such as pinhole size and numerical aperture (NA). For theoretical comparison, the confocal setup was implemented in an in-house, tetrahedron-based, GPU-accelerated Monte Carlo (MC) software. To validate the software, the simulation results for a single cylindrical scatterer were first compared with the analytical solution of Maxwell’s equations in two dimensions. Subsequently, the more complex multi-cylinder structures were simulated using the MC software and compared with the experimental results. For the largest refractive index difference, i.e., air as the surrounding medium, the simulated and measured data show a high degree of agreement, with all key features of the CLSM image reproduced by the simulation. Even when the refractive index difference was reduced significantly, to values as low as 0.005, by using immersion oil, good agreement between simulation and measurement was observed, particularly with respect to the increase in penetration depth.