The integration of the Industrial Internet of Things (IIoT) and blockchain has become a popular approach to providing IIoT with a trustworthy computing environment. Numerous IIoT nodes together form a decentralized network with rich location-aware computation resources, which can offer strong data processing capabilities and low-latency services. However, efficiently processing massive IIoT data with blockchain smart contracts on resource-constrained IIoT nodes remains a challenge, because their storage capacity allows them to hold only a limited portion of the blockchain data. This work aims to improve smart contract execution efficiency on these IIoT nodes through caching based on deep reinforcement learning. On the one hand, we optimize the ledger structure, network architecture, and transaction flow for the characteristics of IIoT, so that IIoT nodes can store and cache part of the block data without affecting global data consistency. On the other hand, we formulate the blockchain caching problem as a Markov decision process and implement a lightweight caching agent based on deep Q-learning, with features and a reward function defined to minimize the execution delay of smart contracts. Extensive experimental results show that the proposed scheme effectively reduces the data dissemination costs and smart contract execution delays of IIoT nodes that hold limited blockchain data.
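
To make the idea of caching as a Markov decision process more concrete, the following is a minimal sketch and not the scheme evaluated in this work. It swaps the deep Q-network for a plain tabular Q-learning agent. The state (cached block set plus requested block), the action set (skip or evict one cached block), the delay constants, and all names such as `QCacheAgent` are assumptions chosen for illustration. The reward is the negative delay of the next request, so maximizing return corresponds to minimizing execution delay.

```python
import random
from collections import defaultdict

HIT_DELAY = 1.0    # assumed delay when the block is cached locally
MISS_DELAY = 10.0  # assumed delay when the block is fetched from a peer
SKIP = -1          # action: leave the cache unchanged


class QCacheAgent:
    """Tabular Q-learning agent that picks which cached block to evict."""

    def __init__(self, capacity, alpha=0.2, gamma=0.9, epsilon=0.1, seed=0):
        self.capacity = capacity
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.rng = random.Random(seed)
        self.q = defaultdict(float)  # (state, action) -> estimated return
        self.cache = set()

    def state(self, block):
        # Hypothetical state: current cache contents and the requested block.
        return (frozenset(self.cache), block)

    def actions(self, block):
        if block in self.cache or len(self.cache) < self.capacity:
            return [SKIP]  # hit, or a free slot exists: nothing to evict
        return [SKIP] + sorted(self.cache)

    def choose(self, state, actions):
        # Epsilon-greedy selection over the available actions.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(actions)
        return max(actions, key=lambda a: self.q[(state, a)])

    def apply(self, block, action):
        if block in self.cache:
            return
        if len(self.cache) < self.capacity:
            self.cache.add(block)      # free slot: always admit
        elif action != SKIP:
            self.cache.remove(action)  # evict the chosen block
            self.cache.add(block)

    def run(self, requests):
        """Serve a request trace, learn online, and return the total delay."""
        total_delay, prev = 0.0, None
        for block in requests:
            delay = HIT_DELAY if block in self.cache else MISS_DELAY
            total_delay += delay
            s, acts = self.state(block), self.actions(block)
            if prev is not None:
                # Q-learning update; reward is the negative delay just observed.
                best_next = max(self.q[(s, a)] for a in acts)
                target = -delay + self.gamma * best_next
                self.q[prev] += self.alpha * (target - self.q[prev])
            a = self.choose(s, acts)
            self.apply(block, a)
            prev = (s, a)
        return total_delay
```

Calling `QCacheAgent(capacity=2).run(trace)` on a list of block identifiers returns the accumulated delay for that trace. A table keyed on the full cache contents grows quickly with the number of blocks, which is one reason a function approximator such as a deep Q-network over compact features is the more practical choice on real nodes.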