Complex problems require considerable work, extensive computation, and the development of effective solution methods. Recently, physical hardware- and software-based technologies have been utilized to support problem solving with computers. However, problem solving often involves human expertise and guidance. In these cases, accurate human evaluations and diagnoses must be communicated to the system, which should be done using a series of real numbers. In previous studies, only binary numbers have been used for this purpose. Hence, to achieve this objective, this paper proposes a new method of learning complex network topologies that coexist and compete in the same environment and interfere with the learning objectives of the others. Considering the special problem of reinforcement learning in an environment in which multiple network topologies coexist, we propose a policy that properly computes and updates the rewards derived from quantitative human evaluation and computes together with the rewards of the system. The rewards derived from the quantitative human evaluation are designed to be updated quickly and easily in an adaptive manner. Our new framework was applied to a basketball game for validation and demonstrated greater effectiveness than the existing methods.