In this paper, we propose a fully automated, learning-based approach for detecting cells in time-lapse phase contrast images. The proposed system combines two machine learning approaches to achieve bottom-up image segmentation. First, we apply pixel-wise classification using a random forest (RF) classifier to determine the potential locations of cells. Each pixel is classified into one of four categories (cell, mitotic cell, halo effect, and background noise). Various image features are extracted at different scales to train the RF classifier. The resulting probability map is partitioned using the k-means algorithm to form potential cell regions. These regions are then expanded into neighboring areas to recover missing or broken cell regions. To validate the cell regions, we propose a second machine learning method based on bag-of-features and spatial pyramid encoding. The second classifier labels each region as a validated cell, a merged cell, or a noncell. A region classified as a merged cell is split using the seeded watershed method. The proposed method is demonstrated on several phase contrast image datasets, namely U2OS, HeLa, and NIH 3T3. Compared with state-of-the-art cell detection techniques, the proposed method shows improved performance, particularly in dealing with noise interference and drastic shape variations.
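
As a rough illustration of the step in which the probability map is partitioned with k-means to form potential cell regions, the following is a minimal sketch in plain Python. It is not the implementation used in this work. Several details are assumptions made only for the example: the map holds a single scalar cell probability per pixel, k-means is run with k = 2 on those scalars, the highest-valued cluster is taken as "cell", and candidate regions are its 4-connected components. The names `kmeans_1d` and `candidate_regions` are hypothetical.

```python
from collections import deque


def kmeans_1d(values, k=2, iters=50):
    """Plain Lloyd's k-means on scalar values; returns centroids in ascending order."""
    lo, hi = min(values), max(values)
    # Assumption: deterministic start, centroids spread evenly over the value range.
    centroids = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    for _ in range(iters):
        buckets = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda c: abs(v - centroids[c]))
            buckets[nearest].append(v)
        # Empty clusters keep their previous centroid.
        updated = [sum(b) / len(b) if b else centroids[i]
                   for i, b in enumerate(buckets)]
        if updated == centroids:
            break
        centroids = updated
    return sorted(centroids)


def candidate_regions(prob_map, k=2):
    """Label 4-connected regions of pixels assigned to the highest-probability cluster."""
    rows, cols = len(prob_map), len(prob_map[0])
    centroids = kmeans_1d([p for row in prob_map for p in row], k)
    top = len(centroids) - 1

    def in_top_cluster(p):
        nearest = min(range(len(centroids)), key=lambda c: abs(p - centroids[c]))
        return nearest == top

    mask = [[in_top_cluster(p) for p in row] for row in prob_map]
    labels = [[0] * cols for _ in range(rows)]  # 0 means "not a candidate region"
    n_regions = 0
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or labels[r][c] != 0:
                continue
            # Breadth-first flood fill of one connected region.
            n_regions += 1
            labels[r][c] = n_regions
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and labels[ny][nx] == 0):
                        labels[ny][nx] = n_regions
                        queue.append((ny, nx))
    return labels, n_regions
```

Calling `candidate_regions` on a small nested list of probabilities returns a label image and the number of regions found. In the pipeline described above, each such region would then be expanded, validated by the second classifier, and split with a seeded watershed if it is judged to be a merged cell; none of those later steps are covered by this sketch.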