Grasping and releasing objects would cause oscillations to delivery drones in the warehouse. To reduce such undesired oscillations, this paper treats the to-be-delivered object as an unknown external disturbance and presents an imagebased disturbance observer (DOB) to estimate and reject such disturbance. Different from the existing DOB technique that can only compensate for the disturbance after the oscillations happen, the proposed image-based one incorporates imagebased disturbance prediction into the control loop to further improve the performance of the DOB. The proposed imagebased DOB consists of two parts. The first one is deep-learningbased disturbance prediction. By taking an image of the tobe-delivered object, a sequential disturbance signal is predicted in advance using a connected pre-trained convolutional neural network (CNN) and a long short-term memory (LSTM) network. The second part is a conventional DOB in the feedback loop with a feedforward correction, which utilizes the deep learning prediction to generate a learning signal. Numerical studies are performed to validate the proposed image-based DOB regarding oscillation reduction for delivery drones during the grasping and releasing periods of the objects.