The hippocampus is a crucial brain structure involved in memory formation, spatial navigation, emotional regulation, and learning. An accurate MRI image segmentation of the human hippocampus plays an important role in multiple neuro-imaging research and clinical practice, such as diagnosing neurological diseases and guiding surgical interventions. While most hippocampus segmentation studies focus on using T1-weighted or T2-weighted MRI scans, we explore the use of diffusion-weighted MRI (dMRI), which offers unique insights into the microstructural properties of the hippocampus. Particularly, we utilize various anisotropy measures derived from diffusion MRI (dMRI), including fractional anisotropy, mean diffusivity, axial diffusivity, and radial diffusivity, for a multi-contrast deep learning approach to hippocampus segmentation. To exploit the unique benefits offered by various contrasts in dMRI images for accurate hippocampus segmentation, we introduce an innovative multimodal deep learning architecture integrating cross-attention mechanisms. Our proposed framework comprises a multi-head encoder designed to transform each contrast of dMRI images into distinct latent spaces, generating separate image feature maps. Subsequently, we employ a gated cross-attention unit following the encoder, which facilitates the creation of attention maps between every pair of image contrasts. These attention maps serve to enrich the feature maps, thereby enhancing their effectiveness for the segmentation task. In the final stage, a decoder is employed to produce segmentation predictions utilizing the attention-enhanced feature maps. The experimental outcomes demonstrate the efficacy of our framework in hippocampus segmentation and highlight the benefits of using multi-contrast images over single-contrast images in diffusion MRI image segmentation.