We consider the flare prediction problem that distinguishes flare-imminent active regions that produce an M- or X-class flare in the succeeding 24 hr, from quiet active regions that do not produce any flares within ±24 hr. Using line-of-sight magnetograms and parameters of active regions in two data products covering Solar Cycles 23 and 24, we train and evaluate two deep learning algorithms—a convolutional neural network (CNN) and a long short-term memory (LSTM)—and their stacking ensembles. The decisions of CNN are explained using visual attribution methods. We have the following three main findings. (1) LSTM trained on data from two solar cycles achieves significantly higher true skill scores (TSSs) than that trained on data from a single solar cycle with a confidence level of at least 0.95. (2) On data from Solar Cycle 23, a stacking ensemble that combines predictions from LSTM and CNN using the TSS criterion achieves a significantly higher TSS than the “select-best” strategy with a confidence level of at least 0.95. (3) A visual attribution method called “integrated gradients” is able to attribute the CNN’s predictions of flares to the emerging magnetic flux in the active region. It also reveals a limitation of CNNs as flare prediction methods using line-of-sight magnetograms: it treats the polarity artifact of line-of-sight magnetograms as positive evidence of flares.