Explosive detection in dual-energy X-ray systems is a difficult problem because the available measurements do not provide enough information to estimate the effective atomic number and density of a material. Although several approximations are available in the literature, building a solution with acceptable true positive and false positive rates is not trivial. In this work, we exploit the learning capability of a multimodal neural network to achieve a high detection rate at an acceptable false positive rate. We also show that fusing the high- and low-energy images with a guided-filter-based method yields fused images that have higher mutual information with respect to the high- and low-energy images than existing solutions. The fused image is one of the two inputs to the neural network; the other is a material-dependent image that we create from the high- and low-energy images. The proposed solution achieves high recall.
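
The fusion-quality criterion named above, mutual information between a fused image and each of its source images, can be estimated from a joint intensity histogram. The following is a minimal sketch under stated assumptions, not the implementation used in this work: the function names `mutual_information` and `fusion_mi_score`, the bin count of 32, the sum of the two mutual information terms as a single score, and the synthetic images are all illustrative choices. The simple pixel average is only a stand-in for a fusion method; it is not the guided-filter-based fusion.

```python
import numpy as np


def mutual_information(a, b, bins=32):
    """Estimate mutual information (in bits) between two same-sized images
    using a joint histogram of their pixel intensities."""
    a = np.asarray(a, dtype=np.float64).ravel()
    b = np.asarray(b, dtype=np.float64).ravel()
    if a.shape != b.shape:
        raise ValueError("images must have the same number of pixels")

    # Joint histogram, normalised to a joint probability table.
    joint, _, _ = np.histogram2d(a, b, bins=bins)
    pxy = joint / joint.sum()

    # Marginals, kept 2-D so their product has the same shape as pxy.
    px = pxy.sum(axis=1, keepdims=True)  # shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)  # shape (1, bins)

    # Sum only over occupied cells to avoid log(0).
    nz = pxy > 0
    return float(np.sum(pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])))


def fusion_mi_score(fused, high, low, bins=32):
    """Hypothetical score: MI(fused, high) + MI(fused, low)."""
    return (mutual_information(fused, high, bins)
            + mutual_information(fused, low, bins))


# Synthetic stand-ins for high- and low-energy images (assumed data).
rng = np.random.default_rng(0)
high = rng.random((128, 128))
low = np.clip(0.6 * high + 0.4 * rng.random((128, 128)), 0.0, 1.0)

fused_avg = 0.5 * (high + low)        # placeholder fusion: pixel average
unrelated = rng.random((128, 128))    # image sharing nothing with the sources

score_avg = fusion_mi_score(fused_avg, high, low)
score_unrelated = fusion_mi_score(unrelated, high, low)
print(f"average fusion score: {score_avg:.3f} bits")
print(f"unrelated image score: {score_unrelated:.3f} bits")
```

Under this kind of score, a fused image that preserves more of the content of both source images scores higher, which is the sense in which fusion methods can be compared. Histogram estimates depend on the bin count and image size, so scores are only comparable when both are held fixed.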