Recent studies and literature reviews have shown promising results for 3GPP system solutions in unlicensed bands when coexisting with Wi-Fi, using either the duty cycle (DC) approach or licensed-assisted access (LAA). However, overall performance in these coexistence scenarios is known to depend on the traffic and on how the duty cycle is adjusted. Most DC solutions configure their parameters statically, which can lead to performance losses when the offered data in the scenario changes. In our previous works, we demonstrated that reinforcement learning (RL) techniques can be used to adjust DC parameters. We showed that a Q-learning (QL) solution that adapts the LTE DC ratio to the transmitted data rate can maximize the Wi-Fi/LTE-Unlicensed (LTE-U) aggregated throughput. In this paper, we extend our previous solution by implementing a simpler and more efficient algorithm based on multi-armed bandit (MAB) theory. We evaluate its performance and compare it with that of the previous solution in different traffic scenarios. The results demonstrate that our new solution offers a better balance in throughput, providing similar results for LTE and Wi-Fi, while still showing a substantial system gain. Moreover, in one of the scenarios, our solution outperforms the previous approach by 6% in system throughput. In terms of user throughput, it achieves a gain of more than 100% for users at the 10th percentile of performance, whereas the previous solution achieves a gain of only 10%.
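To make the idea of bandit-based duty cycle adaptation concrete, the following is a minimal sketch of an epsilon-greedy multi-armed bandit that picks an LTE duty cycle ratio from a small discrete set. It is not the algorithm evaluated in the paper: the abstract does not specify the MAB variant, the action set, or the reward, so the class `EpsilonGreedyDutyCycle`, the candidate ratios, the exploration rate, and the `toy_reward` function (a stand-in for a measured aggregated-throughput signal) are all assumptions chosen for illustration.

```python
import random


class EpsilonGreedyDutyCycle:
    """Epsilon-greedy bandit over candidate LTE duty cycle ratios (illustrative only)."""

    def __init__(self, ratios, epsilon=0.1, seed=0):
        self.ratios = list(ratios)            # assumed discrete action set
        self.epsilon = epsilon                # assumed exploration rate
        self.rng = random.Random(seed)        # seeded for reproducibility
        self.counts = [0] * len(self.ratios)  # times each arm was played
        self.values = [0.0] * len(self.ratios)  # running mean reward per arm

    def select(self):
        # Play every arm once before exploiting.
        for arm, count in enumerate(self.counts):
            if count == 0:
                return arm
        # Explore with probability epsilon, otherwise pick the best arm so far.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.ratios))
        return max(range(len(self.ratios)), key=lambda a: self.values[a])

    def update(self, arm, reward):
        # Incremental update of the arm's mean reward.
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]


def toy_reward(ratio, lte_load=0.6):
    """Hypothetical reward, highest when the duty cycle matches the LTE load share.

    In a real system this would be a measured quantity such as the
    aggregated Wi-Fi/LTE-U throughput; this formula is an assumption.
    """
    return 1.0 - abs(ratio - lte_load)


def run(steps=500):
    agent = EpsilonGreedyDutyCycle([0.2, 0.4, 0.6, 0.8])
    for _ in range(steps):
        arm = agent.select()
        agent.update(arm, toy_reward(agent.ratios[arm]))
    best = max(range(len(agent.ratios)), key=lambda a: agent.values[a])
    return agent.ratios[best], agent.counts


if __name__ == "__main__":
    best_ratio, counts = run()
    print("preferred duty cycle ratio:", best_ratio)
    print("plays per arm:", counts)
```

Unlike Q-learning, this sketch keeps no state and no transition model: it stores one running average per candidate ratio, which is one plausible reading of why a bandit formulation can be simpler. Handling traffic that changes over time would need something beyond a plain running mean (for example a constant step size or a sliding window), which is again an assumption rather than a detail taken from the paper.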