The frame rate of digital high-speed video cameras was 2000 frames per second (fps) in 1989 and has been increasing exponentially since. A simulation study showed that a silicon image sensor made with a 130 nm process technology can achieve about 10^10 fps. The frame rate thus seems to be approaching an upper bound. Rayleigh proposed an expression for the theoretical spatial resolution limit when the resolution of lenses approached its limit. In this paper, the temporal resolution limit of silicon image sensors is analyzed theoretically. The analysis reveals that the limit is mainly governed by the mixing of charges with different travel times, caused by the distribution of the penetration depth of light. The derived expression for the limit is extremely simple, yet accurate. For example, the limit for green light of 550 nm incident on silicon image sensors at 300 K is 11.1 picoseconds. Therefore, the theoretical highest frame rate is 90.1 Gfps (about 10^11 fps).
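
The step from the temporal resolution limit to the highest frame rate can be written out as a short worked calculation. It assumes that the minimum frame interval equals the temporal resolution limit. The symbols \Delta t_{\min} and f_{\max} are introduced here for illustration only and do not come from the text above.

\[
f_{\max} = \frac{1}{\Delta t_{\min}} = \frac{1}{11.1 \times 10^{-12}\ \mathrm{s}} \approx 9.01 \times 10^{10}\ \mathrm{fps} = 90.1\ \mathrm{Gfps}
\]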