This paper investigates a weighting-factor-based reinforcement learning scheme with a physical-information-based channel selection policy, applied to a multi-hop wireless backhaul network with directional antennas in a high-capacity-density wireless system, in order to enhance spectrum efficiency and Quality of Service (QoS). The interference environment of the multi-hop backhaul network is analyzed. A novel channel selection policy is designed based on the interference information obtained from the spectrum sensing process, and is incorporated into a multi-hop-based learning scheme. It is demonstrated that the weighting-factor-based reinforcement learning scheme can efficiently partition channels among users in different locations and achieve a significantly higher QoS than conventional approaches. Moreover, the receiver-based, interference-weighted channel selection policy can speed up the learning process in its initial stage.