The widespread adoption of distributed energy resources (DERs) leads to resource redundancy in grid operation and increases computation complexity, which underscores the need for effective resource management strategies. In this paper, we present a novel resource management approach that decouples the resource selection and power dispatch tasks. The resource selection task determines the subset of resources designated to participate in the demand response service, while the power dispatch task determines the power output of the selected candidates. A solution strategy based on contextual bandit with DQN structure is then proposed. Concretely, an agent determines the resource selection action, while the power dispatch task is solved in the environment. The negative value of the operational cost is used as feedback to the agent, which links the two tasks in a closed-loop manner. Moreover, to cope with the uncertainty in the power dispatch problem, distributionally robust optimization (DRO) is applied for the reserve settlement to satisfy the reliability requirement against this uncertainty. Numerical studies demonstrate that the DQN-based contextual bandit approach can achieve a profit enhancement ranging from 0.35% to 46.46% compared to the contextual bandit with policy gradient approach under different resource selection quantities.