We propose a dual-hormone control algorithm for people with Type 1 Diabetes (T1D) based on deep reinforcement learning (RL). Specifically, double dilated recurrent neural networks, trained by a variant of Q-learning, learn the hormone delivery strategy: their inputs are raw glucose and meal carbohydrate data, and their outputs are actions for delivering the two hormones (basal insulin and glucagon). We develop this data-driven model in the UVA/Padova simulator without prior knowledge of glucose-insulin metabolism. We first pre-train a generalized model in an average T1D environment with long-term exploration, then adopt importance sampling to train a personalized model for each individual. In silico, the proposed algorithm largely reduces adverse glycemic events and achieves a time in range, i.e., the percentage of time in normoglycemia, of 93% for adults and 83% for adolescents, significantly outperforming previous approaches. These results indicate that deep RL has great potential to improve the treatment of chronic illnesses.
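
The time-in-range metric reported above can be illustrated with a minimal sketch. It assumes evenly spaced glucose readings in mg/dL and takes normoglycemia to be 70 to 180 mg/dL; these bounds are an assumption chosen for illustration and are not stated in the abstract. The function name `time_in_range` and the sample readings are hypothetical and are not simulator output.

```python
from typing import Sequence


def time_in_range(
    glucose_mg_dl: Sequence[float],
    low: float = 70.0,    # assumed lower bound of normoglycemia (mg/dL)
    high: float = 180.0,  # assumed upper bound of normoglycemia (mg/dL)
) -> float:
    """Return the percentage of readings that satisfy low <= value <= high.

    With evenly spaced samples, the share of readings in range equals
    the share of time spent in range.
    """
    if len(glucose_mg_dl) == 0:
        raise ValueError("at least one glucose reading is required")
    in_range = sum(1 for g in glucose_mg_dl if low <= g <= high)
    return 100.0 * in_range / len(glucose_mg_dl)


# Hypothetical readings for illustration only.
readings = [65.0, 90.0, 110.0, 150.0, 185.0, 200.0, 120.0, 100.0]
print(f"time in range: {time_in_range(readings):.1f}%")  # 5 of 8 readings -> 62.5%
```

Under these assumptions, a value of 93% would mean that 93 out of every 100 evenly spaced readings fall within the chosen bounds.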