Metallographic images, often called microstructures, contain important information about the properties of metals, such as strength, toughness, ductility, and corrosion resistance, which is used to choose the proper materials for various engineering applications. By understanding the microstructure, one can determine the behaviour of a component made of a particular metal and predict the failure of that component under certain conditions. Image segmentation is a powerful technique for determining morphological features of the microstructure, such as volume fraction, inclusion morphology, voids, and crystal orientations, which are key factors in determining the physical properties of a metal. Automatic microstructure characterization using image processing is therefore useful for industrial applications, which currently adopt deep learning-based segmentation models. In this paper, we propose a metallographic image segmentation method using an ensemble of modified U-Nets. Three U-Net models with the same architecture are each fed a different color representation of the input image (RGB, HSV, or YUV). We modify the U-Net with dilated convolutions and attention mechanisms to extract finer-grained features. We then apply a sum-rule-based ensemble method to the outputs of the three U-Net models to obtain the final prediction mask. We achieve a mean intersection over union (IoU) score of 0.677 on MetalDAM, a publicly available standard dataset. We also show that the proposed method obtains results comparable to state-of-the-art methods with fewer model parameters. The source code of the proposed work can be found at https://github.com/mb16biswas/attention-unet.
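The sum-rule ensemble and the mean IoU metric mentioned above can be illustrated with a minimal NumPy sketch. It is not taken from the linked repository: the function names `sum_rule_ensemble` and `mean_iou`, the `(H, W, C)` probability-map layout, and the choice to skip classes absent from both masks when averaging are all assumptions made for illustration, and the paper's exact evaluation protocol may differ.

```python
import numpy as np


def sum_rule_ensemble(prob_maps):
    """Fuse per-model class-probability maps with the sum rule.

    prob_maps: list of arrays, each of shape (H, W, C), e.g. the softmax
    outputs of the models trained on RGB, HSV, and YUV inputs (assumed layout).
    Returns an (H, W) array of predicted class indices.
    """
    stacked = np.stack(prob_maps, axis=0)  # (M, H, W, C)
    summed = stacked.sum(axis=0)           # add the class scores across models
    return summed.argmax(axis=-1)          # pick the class with the largest sum


def mean_iou(pred, target, num_classes):
    """Mean intersection over union across classes.

    Classes that appear in neither mask are skipped (an assumption; other
    conventions exist).
    """
    pred = np.asarray(pred)
    target = np.asarray(target)
    ious = []
    for c in range(num_classes):
        pred_c = pred == c
        target_c = target == c
        union = np.logical_or(pred_c, target_c).sum()
        if union == 0:
            continue  # class absent from both prediction and ground truth
        intersection = np.logical_and(pred_c, target_c).sum()
        ious.append(intersection / union)
    return float(np.mean(ious)) if ious else float("nan")


# Toy example: three hypothetical model outputs for a 1x2 image, 2 classes.
p_rgb = np.array([[[0.6, 0.4], [0.3, 0.7]]])
p_hsv = np.array([[[0.4, 0.6], [0.2, 0.8]]])
p_yuv = np.array([[[0.7, 0.3], [0.6, 0.4]]])

mask = sum_rule_ensemble([p_rgb, p_hsv, p_yuv])
score = mean_iou(mask, np.array([[0, 1]]), num_classes=2)
```

Summing the probabilities before taking the arg-max lets a confident model outweigh two uncertain ones, which a majority vote over hard labels would not allow.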