Conventional stereoscopic displays suffer from the well-known vergence-accommodation conflict (VAC) because they cannot render correct focus cues for a 3D scene. Computational multilayer light field displays have been explored as one approach that can potentially overcome the VAC, since they promise to render a true 3D scene by sampling the directions of the light rays apparently emitted by that scene. Several pioneering works have demonstrated working prototypes of multilayer light field displays and their potential to render nearly correct focus cues. However, there has been no systematic investigation of methods for modeling and analyzing such displays, which is essential for further optimization and development of high-performance multilayer light field display systems. In this paper, we propose a systematic analysis method for multilayer light field displays that simulates the perceived retinal image, taking into account the display factors, the view dependency of the reference light field, the diffraction effect, and the visual factors. We then apply this model to investigate the accommodative response when observing the display engine.