“…The power consumption of each RF chain $P_{\mathrm{rf}}$, each closed switch $P_{\mathrm{sw}}$, each 4-bit phase shifter $P_{\mathrm{ph}}$, the combiner $P_{\mathrm{com}}$, each power amplifier $P_{\mathrm{amp}}$, and the baseband $P_{\mathrm{bb}}$ is 160 mW [1], 24 mW [25], 42 mW [26], 6.6 mW [27], 60 mW [28], and 200 mW [1], respectively. The hyperparameters used in the multi-agent DQN algorithm are listed in Table I.…”
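
For readers who want to reuse these figures, the per-component values can be collected in a small lookup table. The following is a minimal sketch in Python: the power values are those quoted above, while the additive total-power model, the function name `total_power_mw`, the dictionary keys, and the component counts in the example are assumptions made for illustration and are not taken from the source.

```python
# Per-component power consumption in milliwatts, as quoted in the text.
POWER_MW = {
    "rf_chain": 160.0,        # P_rf
    "switch": 24.0,           # P_sw (closed switch)
    "phase_shifter": 42.0,    # P_ph (4-bit phase shifter)
    "combiner": 6.6,          # P_com
    "power_amplifier": 60.0,  # P_amp
    "baseband": 200.0,        # P_bb
}


def total_power_mw(counts):
    """Return the sum of count * per-unit power over the given components.

    ASSUMPTION: a simple additive model. The source excerpt does not state
    how the components are combined, so this is illustrative only.
    """
    unknown = set(counts) - set(POWER_MW)
    if unknown:
        raise KeyError(f"unknown component(s): {sorted(unknown)}")
    return sum(POWER_MW[name] * n for name, n in counts.items())


# Hypothetical component counts, chosen only to exercise the function.
example_counts = {
    "rf_chain": 4,
    "phase_shifter": 64,
    "power_amplifier": 64,
    "baseband": 1,
}
example_total_mw = total_power_mw(example_counts)
```

Keeping the constants in one mapping makes it easy to swap in a different architecture (for example, switches instead of phase shifters) by changing only the counts passed to the function.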