Herein, we explore the efficient estimation of statistical quantities, particularly rare event probabilities, for stochastic reaction networks and biochemical systems. To this end, we propose a novel importance sampling (IS) approach to improve the efficiency of Monte Carlo (MC) estimators when based on an approximate tau-leap scheme. The crucial step in the IS framework is choosing an appropriate change of probability measure for achieving substantial variance reduction. Typically, this is challenging and often requires insights into the given problem. Based on an original connection between finding the optimal IS parameters within a class of probability measures and a stochastic optimal control (SOC) formulation, we propose an automated approach to obtain a highly efficient path-dependent measure change. The optimal IS parameters are obtained by solving a variance minimization problem. We begin by deriving an associated backward equation solved by these optimal parameters. Given the challenge of analytically solving this backward equation, we propose a numerical dynamic programming algorithm to approximate the optimal control parameters. In the one-dimensional case, our numerical results show that the variance of our proposed estimator decays at a rate of O(∆t) for a step size of ∆t, compared to O(1) for a standard MC estimator. For a given prescribed error tolerance, TOL, this implies an improvement in the computational complexity to become O(TOL −2 ) instead of O(TOL −3 ) when using a standard MC estimator. To mitigate the curse of dimensionality issue caused by solving the backward equation in the multi-dimensional case, we propose an alternative learning-based method that approximates the value function using a neural network, the parameters of which are determined via a stochastic optimization algorithm. The learning process uses the optimality criterion of our SOC problem, which relates the optimal controls to the value function. In this exploratory work, our numerical experiments demonstrate that our learning-based IS approach substantially reduces the variance of the MC estimator.