Background
Aldehyde dehydrogenases (ALDHs) are a group of enzymes that detoxify aldehydes by facilitating their oxidation to carboxylic acids, and they have been shown to play roles in plant responses to abiotic stresses. However, comprehensive analysis of the ALDH superfamily in soybean (Glycine max) has been limited.

Results
In the present study, a total of 53 GmALDHs were identified in soybean and grouped into 10 ALDH families according to the ALDH Gene Nomenclature Committee criteria and phylogenetic analysis. These groupings were supported by their gene structures and conserved motifs. The soybean ALDH superfamily expanded mainly through whole-genome duplication/segmental duplication. Gene network analysis identified 1146 putative co-functional genes for 51 GmALDHs. Gene Ontology (GO) enrichment analysis suggested that the co-functional genes of these 51 GmALDHs were enriched (FDR < 1e-3) in the processes of lipid metabolism, photosynthesis, proline catabolism, and small molecule catabolism. In addition, 22 co-functional genes of GmALDHs are related to plant response to water deprivation/water transport. GmALDHs exhibited various expression patterns in different soybean tissues. In response to water deficit, the expression levels of 13 GmALDHs were significantly up-regulated and those of 14 were significantly down-regulated. The occurrence frequencies of three drought-responsive cis-elements (ABRE, CRT/DRE, and GTGCnTGC/G) were compared among GmALDH genes that were up-, down-, or non-regulated by water deficit. These three cis-elements occurred at a higher frequency in the up-regulated GmALDH genes than in the down- or non-regulated GmALDH genes, implying potential roles in the regulation of the soybean response to drought stress.

Conclusions
A total of 53 ALDH genes were identified in the soybean genome, and their phylogenetic relationships and duplication patterns were analyzed.
The potential functions of GmALDHs were predicted by analyses of their co-functional gene networks, gene expression profiles, and cis-regulatory elements. Three GmALDH genes (GmALDH3H2, GmALDH12A2, and GmALDH18B3) were highly induced by drought stress in soybean leaves. Our study provides a foundation for future investigations of GmALDH gene function in soybean.

Electronic supplementary material
The online version of this article (doi:10.1186/s12864-017-3908-y) contains supplementary material, which is available to authorized users.
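
The cis-element comparison summarized in the Results can be illustrated with a minimal sketch. It is not the authors' pipeline, and it rests on several assumptions: the core sequences used for ABRE (ACGTG) and CRT/DRE (CCGAC) are commonly cited cores, not definitions taken from this abstract; GTGCnTGC/G is read as GTGC, any base, TG, then C or G; both strands are scanned; and the helper names `count_motif` and `mean_frequency` are hypothetical.

```python
import re

# Assumed motif patterns. Only GTGCnTGC/G is spelled out in the abstract;
# the ABRE and CRT/DRE cores below are assumptions for illustration.
MOTIFS = {
    "ABRE": "ACGTG",
    "CRT/DRE": "CCGAC",
    "GTGCnTGC/G": "GTGC[ACGT]TG[CG]",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def count_motif(seq, pattern):
    """Count overlapping motif matches on both strands of one sequence."""
    seq = seq.upper()
    # A lookahead makes the regex report overlapping occurrences too.
    regex = re.compile(f"(?=({pattern}))")
    forward = sum(1 for _ in regex.finditer(seq))
    reverse = sum(1 for _ in regex.finditer(reverse_complement(seq)))
    return forward + reverse


def mean_frequency(promoters, pattern):
    """Average number of motif occurrences per promoter in a gene group."""
    if not promoters:
        return 0.0
    return sum(count_motif(s, pattern) for s in promoters) / len(promoters)
```

Given promoter sequences grouped by expression response (up-, down-, or non-regulated), calling `mean_frequency(group, MOTIFS[name])` for each group and motif yields the per-group frequencies to compare. Palindromic patterns would be counted once per strand under this scheme, so a strand-aware variant may be preferable for such motifs.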