In opportunistic channel access, the user needs to make real time decisions on when and which channel to access with uncertainty. Assuming perfect channel statistics, several studies have applied optimal stopping theory to derive control strategy for sequential sensing/probing based opportunistically accessing (s-SPA), exploiting temporary opportunities among multiple channels. Meanwhile, numerous multi-arm bandit (MAB)-based approaches have been proposed for online learning of channel selection in periodical sensing/accessing system, however, these schemes fail to exploit the opportunistic diversity in short term. In this paper, we investigate online learning of optimal control in s-SPA systems, where both statistics learning and temporary opportunity utilization are jointly considered. An effective and efficient online policy, so called IE-OSP, is proposed, which theoretically guarantees system converges to the optimal s-SPA strategy with bounded probability. Experimental results further show that, the regret of IE-OSP is almost in optimal logarithmic increasing rate over time, and is sub-linear with the increasing number of channels. Compared with existing solutions, our proposed algorithm achieves 25 ∼ 30% throughput gain in typical scenarios.Index Terms-Opportunistic spectrum access, sequential sensing and accessing, online learning, diversity exploitation.