The Orienteering Problem (OP) seeks a path on a graph that maximizes the total reward collected subject to a budget on path length. Typically, a reward is collected by visiting a vertex in the graph, and that reward is constant over time. This paper considers a variant of OP in which the reward of each vertex is an arbitrary time-dependent function, hence the name time-varying reward OP (TR-OP). To solve this problem, we develop a novel heuristic search algorithm called Reward Maximization A* (RMA*), which is guaranteed to find an optimal solution to TR-OP. We also develop a fast method to compute an admissible heuristic for RMA* that effectively directs the search and saves computational effort. Furthermore, we introduce a hyper-parameter in RMA* that trades off solution quality against runtime efficiency. We benchmark RMA* against a recent dynamic programming (DP) approach, which runs fast in practice but offers no guarantee of solution optimality. In our tests, RMA* reduces the runtime by up to 70% compared to DP. By adjusting the hyper-parameter, RMA* finds solutions with up to 30% more reward than those found by DP.
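To make the problem statement concrete, the following minimal sketch evaluates TR-OP on a toy instance by exhaustive search. It is not RMA* and not the DP baseline; it only illustrates how a time-dependent reward changes which path is best. Several details are assumptions made for illustration and may differ from the formulation used in the paper: edges carry travel times, the path starts at time 0 at a given start vertex, each vertex is visited at most once, and the reward of a vertex is the value of its reward function at the arrival time. The names `best_path`, `graph`, and `rewards` are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Graph = Dict[str, Dict[str, float]]   # vertex -> {neighbor: travel time}
RewardFn = Callable[[float], float]   # arrival time -> reward (assumed form)


def best_path(graph: Graph, rewards: Dict[str, RewardFn],
              start: str, budget: float) -> Tuple[float, List[str]]:
    """Brute-force TR-OP: try every simple path from `start` within `budget`."""
    start_reward = rewards[start](0.0)
    best: Tuple[float, List[str]] = (start_reward, [start])

    def dfs(v: str, t: float, collected: float,
            path: List[str], visited: set) -> None:
        nonlocal best
        if collected > best[0]:
            best = (collected, list(path))
        for u, cost in graph[v].items():
            arrival = t + cost
            # Assumption: no revisits, and the budget bounds total travel time.
            if u in visited or arrival > budget:
                continue
            visited.add(u)
            path.append(u)
            dfs(u, arrival, collected + rewards[u](arrival), path, visited)
            path.pop()
            visited.remove(u)

    dfs(start, 0.0, start_reward, [start], {start})
    return best


# Toy instance: the reward at "a" decays with time, the reward at "b" grows.
graph: Graph = {
    "s": {"a": 1.0, "b": 2.0},
    "a": {"s": 1.0, "b": 1.0},
    "b": {"s": 2.0, "a": 1.0},
}
rewards: Dict[str, RewardFn] = {
    "s": lambda t: 0.0,
    "a": lambda t: max(0.0, 5.0 - 2.0 * t),
    "b": lambda t: 4.0 if t >= 2.0 else 1.0,
}

total, path = best_path(graph, rewards, "s", budget=3.0)
print(total, path)
```

On this instance the order of visits matters: reaching "a" first collects its reward before it decays, whereas visiting "b" first leaves "a" worth nothing by the time it is reached. Exhaustive search of this kind scales exponentially with the number of vertices, which is the cost that a guided search with an admissible heuristic aims to reduce.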