In this article, we systematically analyze a deep neural networks-based image caption generation method. Image Captioning aims to automatically generate a sentence description for an image. Our article model will take an image as input and generate on English sentence as output, describing the contents of the image. It has attracted much research attention in cognitive computing in the recent years. The task is rather complex, as the concepts of both computer vision and natural language processing domains are combined together. We have developed a model using the concepts of a Convolutional Neural Network (CNN) and long Short-Term Memory (LSTM) model and build a working model of Image caption generator by implementing CNN and LSTM. After the caption generation phase, we use BLEU Scores to evaluate the efficiency of our model. Thus, our system helps the user to get descriptive caption for the given input image.