Speech is one of the most revealing media through which the gender of a speaker can be identified. Although related research using machine learning has made good progress, deep learning has recently opened a promising research direction for addressing the limitations of gender discrimination with traditional machine learning techniques. In deep learning, speech features are learned automatically from the raw data through representation learning, and these learned features often have more discriminating power than hand-crafted ones. In some practical situations such as gender recognition, however, it is observed that a combination of both types of features can provide comparatively better performance. In the proposed work, we first extract and select informative and precise acoustic features relevant to gender recognition using entropy-based information theory and Rough Set Theory (RST). Next, the audio speech signals are fed directly into a deep neural network model consisting of a Convolutional Neural Network (CNN) and a Gated Recurrent Unit network (GRUN) to extract features useful for gender recognition. The RST selects precise and informative features, the CNN extracts locally encoded important features, and the GRUN mitigates the vanishing and exploding gradient problems. Finally, a hybrid gender recognition system is developed by combining both generated feature vectors. The developed model has been tested on five benchmark datasets and one simulated dataset to evaluate its performance, and it is observed that the combined feature vector yields a more effective gender recognition system, especially when transgender is considered as a gender category alongside male and female.
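The entropy-based feature selection stage mentioned above can be illustrated with a minimal sketch. The example below ranks a candidate acoustic feature by its information gain (reduction in Shannon entropy of the gender labels after a split); the feature (mean fundamental frequency), the threshold, and all data values are hypothetical illustrations, not taken from the paper, and the RST reduct computation is omitted.

```python
# Hypothetical sketch of entropy-based feature scoring for gender recognition.
# Feature values, labels, and the split threshold are illustrative only.
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (bits) of a sequence of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(feature_values, labels, threshold):
    """Gain from splitting a numeric feature at `threshold`:
    H(labels) minus the weighted entropy of the two resulting partitions."""
    left = [l for v, l in zip(feature_values, labels) if v <= threshold]
    right = [l for v, l in zip(feature_values, labels) if v > threshold]
    n = len(labels)
    conditional = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(labels) - conditional

# Toy data: mean fundamental frequency (Hz) per speaker, with gender labels.
f0 = [110, 120, 125, 210, 220, 230]
labels = ["M", "M", "M", "F", "F", "F"]

gain = information_gain(f0, labels, threshold=165)
print(gain)  # 1.0 here: this split separates the two classes perfectly
```

A feature-selection pass would compute such a score for every candidate acoustic feature and keep only the highest-scoring ones before the RST reduction step.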