Reconfigurable intelligent surface (RIS) relaying has recently been suggested as a promising technology for extending millimeter wave (mmWave) coverage. However, finding the RIS relay that maximizes the achievable data rate is highly time-consuming, because a beamforming training (BT) procedure is needed to adjust the antenna phase shifts (PSs) of both the mmWave base station (BS) and the probed RIS relay. Finding the best RIS relay at the minimum BT time cost is therefore challenging. In this paper, a cost-effective online learning approach based on the multi-armed bandit (MAB) framework is proposed to address this problem. Two MAB schemes with time-cost efficiency, MAB-CE1 and MAB-CE2, are introduced. In MAB-CE1, the BT time cost of selecting an RIS relay is included in the exploitation term of the MAB algorithm. In MAB-CE2, by contrast, the lower and upper confidence bounds (LCB, UCB) of the RISs' expected achievable spectral efficiencies are used to support selection of the RIS relay with the minimum BT time cost. Numerical analysis shows that the proposed cost-effective MAB schemes for RIS mmWave relaying outperform other benchmarks in terms of BT time cost and achievable throughput.
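
To make the two selection rules concrete, the following Python sketch shows one possible reading of them. It is an illustration under stated assumptions, not the formulation used in the paper: the abstract gives no formulas, so the UCB1-style confidence radius, the scaling of the mean spectral efficiency by `(1 - cost)` in the MAB-CE1 index, and the "UCB at least the best LCB" candidate set in MAB-CE2 are all hypothetical choices, as are the names `select_ce1`, `select_ce2`, and `confidence_radius`. Each RIS relay is treated as an arm, `means` holds empirical mean spectral efficiencies, and `costs` holds BT time costs expressed as fractions of a time slot.

```python
import math


def confidence_radius(t, n):
    """UCB1-style exploration radius (assumed form, not taken from the paper)."""
    return math.sqrt(2.0 * math.log(t) / n)


def select_ce1(means, counts, costs, t):
    """Hypothetical MAB-CE1 rule: BT cost enters the exploitation term.

    means  -- empirical mean spectral efficiency per RIS relay
    counts -- number of times each relay has been selected
    costs  -- BT time cost per relay as a fraction of the slot, in [0, 1)
    t      -- current round (t >= 1)
    """
    # Try every relay once before using the index.
    for i, n in enumerate(counts):
        if n == 0:
            return i
    indices = [
        # Assumed exploitation term: mean rate discounted by the BT overhead.
        means[i] * (1.0 - costs[i]) + confidence_radius(t, counts[i])
        for i in range(len(means))
    ]
    return max(range(len(indices)), key=indices.__getitem__)


def select_ce2(means, counts, costs, t):
    """Hypothetical MAB-CE2 rule: LCB/UCB filter, then minimum BT cost."""
    for i, n in enumerate(counts):
        if n == 0:
            return i
    radii = [confidence_radius(t, n) for n in counts]
    lcb = [m - r for m, r in zip(means, radii)]
    ucb = [m + r for m, r in zip(means, radii)]
    best_lcb = max(lcb)
    # Relays that cannot yet be ruled out as the best one.
    candidates = [i for i in range(len(means)) if ucb[i] >= best_lcb]
    # Among plausible relays, pick the one that is cheapest to train.
    return min(candidates, key=lambda i: costs[i])


if __name__ == "__main__":
    means = [5.0, 4.9, 2.0]      # illustrative values only
    counts = [1000, 1000, 1000]
    costs = [0.4, 0.1, 0.05]
    print(select_ce1(means, counts, costs, t=3000))
    print(select_ce2(means, counts, costs, t=3000))
```

In this toy setting the first two relays have overlapping confidence intervals, so both rules favor the second relay, whose BT cost is much lower, while the third relay is excluded because its spectral efficiency is clearly worse despite being the cheapest to train.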