Short videos are popular information carriers on the Internet, and detecting events in them can benefit a wide range of applications, e.g., video browsing, management, retrieval, and recommendation. Existing video analysis methods typically require decoding all frames of a video in advance, which is costly in both time and computation. Moreover, short videos are often untrimmed, noisy, and even incomplete, which further complicates event analysis. Unlike previous work focusing on actions, we target short video event detection and propose Recurrent Compressed Convolutional Networks (RCCN) to discover the underlying event patterns in short videos, possibly including a large proportion of non-event videos. Instead of operating on fully decoded videos, RCCN performs representation learning at much lower cost in the compressed domain, where the encoded motion information reflecting the spatial relations among frames can be obtained cheaply to capture the dynamic tendencies of event videos. This alleviates the information incompleteness problem that frequently arises in user-generated short videos. In particular, RCCN uses convolutional networks as the backbone and Long Short-Term Memory components to model variable-range temporal dependencies among untrimmed video frames. RCCN not only learns the common representation shared by short videos of the same event but also gains the discriminative ability to detect dissimilar videos. We benchmark the model on a set of short videos derived from the publicly available event detection database YLIMED and compare RCCN with several baselines and state-of-the-art alternatives. Empirical studies verify the favorable performance of RCCN.

INDEX TERMS Compressed domain, event analysis, recurrent neural networks, short video event detection, temporal dependency.
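
To make the described architecture concrete, the following is a minimal sketch of a CNN backbone applied to compressed-domain motion features, followed by an LSTM for temporal modeling and a linear head for event classification. It is not the authors' implementation: the class name CompressedCnnLstm, all layer sizes, and the assumed input format (per-frame motion-vector maps of shape [2, H, W]) are illustrative assumptions.

# Minimal sketch (assumptions noted above), using PyTorch.
import torch
import torch.nn as nn


class CompressedCnnLstm(nn.Module):
    def __init__(self, num_events: int, hidden_size: int = 256):
        super().__init__()
        # Lightweight 2D CNN applied to each frame's motion-vector map.
        self.backbone = nn.Sequential(
            nn.Conv2d(2, 32, kernel_size=3, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(1),  # -> [B*T, 64, 1, 1]
        )
        # LSTM models variable-range temporal dependencies across frames.
        self.lstm = nn.LSTM(input_size=64, hidden_size=hidden_size,
                            batch_first=True)
        self.classifier = nn.Linear(hidden_size, num_events)

    def forward(self, motion: torch.Tensor) -> torch.Tensor:
        # motion: [B, T, 2, H, W] motion-vector maps from the compressed stream.
        b, t, c, h, w = motion.shape
        feats = self.backbone(motion.view(b * t, c, h, w)).view(b, t, -1)
        seq_out, _ = self.lstm(feats)            # [B, T, hidden_size]
        return self.classifier(seq_out[:, -1])   # event logits from last step


if __name__ == "__main__":
    model = CompressedCnnLstm(num_events=10)
    dummy = torch.randn(4, 16, 2, 56, 56)  # 4 clips, 16 frames each
    print(model(dummy).shape)               # torch.Size([4, 10])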