Stock portfolio selection is a hard problem in the FinTech field because of the diversity of data characteristics and the dynamic complexity of the market. Although deep learning has made great progress on the complex and highly stochastic portfolio problem, existing research still faces significant limitations. Existing methods either consider only investment returns or simply use macro-market data to guide the model against risk. However, the market's prevailing preference strongly affects stock selection, and in practice investors favor portfolios with low correlation between assets, because related assets tend to move together through ripple effects. In this paper, we propose a novel framework, called Mercury, which views stock screening as a reinforcement learning process. In particular, to better perceive market changes and generate higher returns, the framework models the sensitivity of market preferences and learns dynamic temporal and spatial dependency patterns between assets from historical trading data. Additionally, the framework employs reinforcement learning to screen for a portfolio with low overall correlation, which improves its ability to withstand investment risk while maintaining returns. We use daily data from China's A-share market to verify the effectiveness and robustness of Mercury. The framework also shows strong generalization ability and can be easily extended to other trading procedures.

INDEX TERMS Deep reinforcement learning, risk-return balanced portfolio strategy, market preferences, low-correlation assets.