Real-world networks are often prone to failures. A reliable network needs to cope with this situation and must provide a backup communication channel. This motivates the study of survivable network design, which has been a focus of research for a few decades. To date, survivable network design problems on undirected graphs are well understood. For example, there is a 2-approximation in the case of edge failures [Jain, FOCS'98/Combinatorica'01]. The problems on directed graphs, in contrast, have seen very little progress. Most techniques for the undirected case, such as primal-dual and iterative rounding methods, do not seem to extend to the directed case. Almost no non-trivial approximation algorithm is known even for a simple case where we wish to design a network that tolerates a single failure.

In this paper, we study a survivable network design problem on directed graphs, 2-Connected Directed Steiner Tree (2-DST): given an n-vertex weighted directed graph, a root r, and a set of h terminals S, find a min-cost subgraph H that has two edge/vertex disjoint paths from r to any t ∈ S. 2-DST is a natural generalization of the classical Directed Steiner Tree problem (DST), where we have an additional requirement that the network must tolerate one failure. No non-trivial approximation is known for 2-DST. This was left as an open problem by Feldman et al. [SODA'09; JCSS] and has since been studied by Cheriyan et al. [SODA'12; TALG] and Laekhanukit [SODA'14]. However, no positive result was known except for the special case of a D-shallow instance [Laekhanukit, ICALP'16].

We present an O(D^3 log D · h^{2/D} · log n) approximation algorithm for 2-DST that runs in time O(n^{O(D)}), for any D ∈ [log_2 h]. This implies a polynomial-time O(h^ε log n) approximation for any constant ε > 0, and a poly-logarithmic approximation running in quasi-polynomial time.
We remark that this is essentially the best known guarantee even for the classical DST, and the latter problem is O(log^{2−ε} n)-hard to approximate [Halperin and Krauthgamer, STOC'03]. As a by-product, we obtain an algorithm with the same approximation guarantee for the 2-Connected Directed Steiner Subgraph problem, where the goal is to find a min-cost subgraph such that every pair of terminals is 2-edge/vertex connected.

Our approximation algorithm is based on a careful combination of several techniques. In more detail, we decompose an optimal solution into two (possibly not edge disjoint) divergent trees that induce two edge disjoint paths from the root to any given terminal. These divergent trees are then embedded into a shallow tree by means of Zelikovsky's height reduction theorem. On the latter tree we solve a 2-Connected Group Steiner Tree problem and then map this solution back to the original graph. Crucially, our tree embedding is achieved via a probabilistic mapping guided by an LP. This is the main technical novelty of our approach, and might be useful for future work.
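
The polynomial-time and quasi-polynomial-time consequences stated above follow by substituting particular values of D into the main bound. The following is only a brief sketch of that arithmetic, not taken from the paper itself; it assumes that ε is a fixed constant, that h is large enough for ⌈2/ε⌉ ≤ log_2 h to hold, and (for simplicity) that log_2 h is an integer.

```latex
% Main bound: ratio O(D^3 \log D \cdot h^{2/D} \cdot \log n), time n^{O(D)}.

% Case 1 (assumed choice): D = \lceil 2/\varepsilon \rceil, a constant.
\[
  \frac{2}{D} \le \varepsilon
  \;\Longrightarrow\;
  h^{2/D} \le h^{\varepsilon},
  \qquad
  D^{3}\log D = O(1),
\]
\[
  \text{ratio } O\!\left(h^{\varepsilon}\log n\right),
  \qquad
  \text{time } n^{O(1/\varepsilon)} = n^{O(1)}.
\]

% Case 2 (assumed choice): D = \log_2 h.
\[
  h^{2/\log_2 h} = \left(2^{\log_2 h}\right)^{2/\log_2 h} = 2^{2} = 4,
\]
\[
  \text{ratio } O\!\left(\log^{3} h \cdot \log\log h \cdot \log n\right),
  \qquad
  \text{time } n^{O(\log h)}.
\]
```

In the first case the constants hidden in the ratio depend on ε; in the second case the running time is quasi-polynomial since h ≤ n.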