Bone age assessment (BAA) from hand radiographs is crucial for diagnosing endocrine disorders in adolescents and for informing therapeutic investigation. In practice, because conventional clinical assessment is a subjective estimation, the accuracy of BAA depends heavily on the pediatrician's expertise and experience. Recently, many deep learning methods have been proposed for automatic bone age estimation and have achieved good results. However, these methods either fail to exploit sufficient discriminative information or require additional manual annotations of critical bone regions, which are important biological identifiers of skeletal maturity; both limitations may restrict the clinical application of these approaches. In this research, we propose a novel two-stage deep learning method for BAA that requires no manual region annotation. It consists of a cascaded critical bone region extraction network and a gender-assisted bone age estimation network. First, the cascaded critical bone region extraction network automatically and sequentially locates two discriminative bone regions using visual heat maps. Second, to obtain an accurate BAA, the extracted critical bone regions are fed into the gender-assisted bone age estimation network. The proposed method achieved a mean absolute error (MAE) of 5.45 months on the public Radiological Society of North America (RSNA) dataset and 3.34 months on our private dataset.
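
For readers unfamiliar with the reported metric, the following is a minimal sketch of how a mean absolute error in months can be computed from predicted and reference bone ages. It shows only the standard definition of MAE; the function name `mean_absolute_error` and the sample values are illustrative assumptions and are not taken from the RSNA dataset, the private dataset, or the authors' evaluation code.

```python
from typing import Sequence


def mean_absolute_error(predicted_months: Sequence[float],
                        reference_months: Sequence[float]) -> float:
    """Return the mean absolute difference between two equal-length sequences."""
    if len(predicted_months) != len(reference_months):
        raise ValueError("predicted and reference sequences must have the same length")
    if len(predicted_months) == 0:
        raise ValueError("at least one sample is required")
    # Sum the per-sample absolute errors, then average over the samples.
    total = sum(abs(p - r) for p, r in zip(predicted_months, reference_months))
    return total / len(predicted_months)


# Hypothetical bone ages in months, for illustration only (not from any dataset).
predicted = [120.0, 96.0, 150.0]
reference = [126.0, 90.0, 153.0]
mae = mean_absolute_error(predicted, reference)
print(f"MAE: {mae:.2f} months")
```

Because the metric averages absolute differences, over- and under-estimates of bone age do not cancel, and the result is expressed in the same unit (months) as the inputs.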