In applications such as VoIP, speech codecs must cope with excessive packet losses caused by network errors and/or delays. In this paper, a new method for reconstructing lost speech spectral envelopes is presented, based on a statistical estimation function. We suggest the use of a minimal "corrective" bitstream and propose Coding with Side Information (CSI) techniques for an efficient Forward Error Correction (FEC) strategy. The proposed methods are tested on multiple missing-frame scenarios. Objective results indicate that, with only 4 bits per lost frame, spectral distortion is reduced by 0.77 to 1.14 dB compared to current state-of-the-art estimation methods. Compared to "predictive" estimation methods, using the jitter buffer as side information together with 4 bits per lost frame provides a 42% reduction in spectral distortion for single packet losses and a 32% reduction for double packet losses. Subjective results indicate that the corrected speech has fewer artifacts.