Visual evoked potentials (VEPs) are electrical signals measured from the scalp in response to rapid, repetitive visual stimuli. The windowed adaptive chirplet transform (ACT) was recently proposed to provide a unified and compact representation of VEPs, from their transient portion through to their steady-state portion. An important question is how to select the window length properly. In this paper we show that the lower bound on the window length is limited by the signal-to-noise ratio (SNR), while the upper bound is set by the duration of the transient portion of the VEPs. For our data, we propose an optimal length of 0.416 s (100 points). This length is optimal in the sense that, subject to the estimators remaining efficient, the time resolution of the chirplet analysis is kept as high as possible.
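
To make the relation between window length in points and in seconds concrete, the following is a minimal sketch, not the authors' implementation. The sampling rate `FS_HZ = 240.0` is an assumption inferred from the stated figures (100 points over roughly 0.416 s); the abstract does not give it. The helper names (`window_length_seconds`, `split_into_windows`, `gaussian_chirplet`) and the chirplet parameterization are hypothetical, using the common Gaussian-envelope form with a linear frequency sweep.

```python
import numpy as np

FS_HZ = 240.0    # ASSUMPTION: inferred from 100 points ~ 0.416 s; not stated in the text
N_POINTS = 100   # window length in samples, as stated in the text


def window_length_seconds(n_points, fs_hz):
    """Convert a window length in samples to seconds."""
    return n_points / fs_hz


def split_into_windows(signal, n_points):
    """Cut a 1-D signal into consecutive non-overlapping windows.

    Trailing samples that do not fill a whole window are dropped.
    """
    signal = np.asarray(signal)
    n_windows = len(signal) // n_points
    return signal[: n_windows * n_points].reshape(n_windows, n_points)


def gaussian_chirplet(n_points, fs_hz, t_center, f_center, chirp_rate, duration):
    """Unit-energy Gaussian chirplet sampled on one window (hypothetical form).

    t_center   : time center in seconds
    f_center   : frequency at the time center in Hz
    chirp_rate : linear frequency sweep in Hz/s
    duration   : standard deviation of the Gaussian envelope in seconds
    """
    t = np.arange(n_points) / fs_hz
    tau = t - t_center
    envelope = np.exp(-0.5 * (tau / duration) ** 2)
    phase = 2.0 * np.pi * (f_center * tau + 0.5 * chirp_rate * tau ** 2)
    atom = envelope * np.exp(1j * phase)
    return atom / np.linalg.norm(atom)


if __name__ == "__main__":
    print(f"window length: {window_length_seconds(N_POINTS, FS_HZ):.4f} s")

    # Synthetic stand-in for a recording: 2.5 windows' worth of noise.
    rng = np.random.default_rng(0)
    recording = rng.standard_normal(250)
    windows = split_into_windows(recording, N_POINTS)
    print("windows shape:", windows.shape)

    # Correlate each window with a single chirplet atom.
    atom = gaussian_chirplet(N_POINTS, FS_HZ, t_center=0.2, f_center=10.0,
                             chirp_rate=5.0, duration=0.05)
    coefficients = windows @ np.conj(atom)
    print("coefficient magnitudes:", np.abs(coefficients))
```

The sketch only illustrates the bookkeeping: a shorter window leaves fewer samples per estimate, so noise weighs more heavily, while a longer window spans more of the transient response. The adaptive fitting of chirplet parameters within each window is not shown here.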