“…However, the majority of existing theoretical analysis of RL is only applicable to the tabular setting (see, e.g., [1][2][3][4][5][6]), in which both the state and action spaces are discrete and finite, and no function approximation is involved. Relatively simple function approximation methods, such as the linear model in [7,8] or generalized linear model in [9,10], have been recently studied in the context of RL with various statistical estimates. Yet, these results are not sufficient to explain the practical success of RL algorithms in high dimensions.…”