2015
DOI: 10.3390/s150306668

Beamforming and Power Control in Sensor Arrays Using Reinforcement Learning

Abstract: The use of beamforming and power control, combined or separately, has advantages and disadvantages, depending on the application. The combined use of beamforming and power control has been shown to be highly effective in applications involving the suppression of interference signals from different sources. However, it is necessary to identify efficient methodologies for the combined operation of these two techniques. The most appropriate technique may be obtained by means of the implementation of an intelligen…

Cited by 7 publications (7 citation statements)
References: 30 publications
“…Although the Q-learning convergence criterion requires each state–action pair to be visited infinitely often, in practice adequate values can be reached by executing a sufficiently large number of iterations relative to the task being learned. For a problem with 18 states and 2 actions, as described in [2], where Q-learning was used to determine an optimal selection policy between beamforming and power control for an adaptive antenna array, the Q matrix converges after approximately 2,500 iterations. In the work of [29], Q-learning was used for adaptive thermal management of multicore systems, in order to improve their reliability and extend their useful life.…”
Section: Q-learning Technique
Citation type: mentioning (confidence: 99%)
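As a rough illustration of the scale discussed in this statement, the sketch below sets up tabular Q-learning for an 18-state, 2-action selection problem like the one in [2] and runs it for about 2,500 iterations. The environment dynamics, reward, and hyper-parameters are placeholder assumptions for the sketch, not the cited implementation.

```python
import numpy as np

# Minimal tabular Q-learning sketch for an 18-state, 2-action selection
# problem (action 0 = beamforming, action 1 = power control), matching the
# problem size reported in [2]. Reward and dynamics below are assumptions.
N_STATES, N_ACTIONS = 18, 2
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1   # assumed hyper-parameters
N_ITERATIONS = 2500                     # order of magnitude reported in [2]

rng = np.random.default_rng(0)
Q = np.zeros((N_STATES, N_ACTIONS))

def step(state, action):
    """Placeholder environment: random next state, reward favouring one
    technique in low-index states and the other elsewhere."""
    next_state = rng.integers(N_STATES)
    reward = 1.0 if (state < 9) == (action == 1) else -1.0
    return next_state, reward

state = rng.integers(N_STATES)
for _ in range(N_ITERATIONS):
    # epsilon-greedy action selection over the current Q estimates
    if rng.random() < EPSILON:
        action = int(rng.integers(N_ACTIONS))
    else:
        action = int(np.argmax(Q[state]))
    next_state, reward = step(state, action)
    # standard Q-learning update
    Q[state, action] += ALPHA * (reward + GAMMA * Q[next_state].max() - Q[state, action])
    state = next_state

print(Q.round(2))
```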
“…In the problem presented in [29], which has the same number of states and actions as scenario III, the value-function matrix converges after 5,500 iterations and would need 0.2 ms. In the problem presented in [2], with 2,500 iterations, it would take 0.1 ms. If there is a time restriction on acquiring the data needed to compute the policy, it is also possible to reduce the system clock so that, as demonstrated in [11], the system's consumption is reduced and the execution time is adjusted to the data acquisition.…”
Section: E. Power Consumption Analysis
Citation type: mentioning (confidence: 99%)
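The per-iteration cost implied by those totals can be back-calculated; the snippet below simply divides the quoted execution times by the quoted iteration counts and is only a sanity check, not part of the cited analysis.

```python
# Back-of-the-envelope check of the execution times quoted above
# (5,500 iterations in 0.2 ms and 2,500 iterations in 0.1 ms).
for iterations, total_ms in [(5500, 0.2), (2500, 0.1)]:
    per_iteration_ns = total_ms * 1e6 / iterations
    print(f"{iterations} iterations in {total_ms} ms "
          f"-> ~{per_iteration_ns:.0f} ns per Q-learning update")
```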
“…Power control schemes based on reinforcement learning (RL) and mathematical optimization have been proposed to take interference into account. To determine the optimal combination of beamforming and power control in sensor arrays, an RL algorithm is proposed in [24]. In this case, the power configuration set of this kind of scheme can be explored as in [25].…”
Section: Related Work
Citation type: mentioning (confidence: 99%)
“…In this paper, we further optimize the COPMA and RMSG techniques to minimize overhead and power consumption, respectively. In [26], a reinforcement learning (RL) based algorithm is used to determine a suitable policy for selecting between two techniques for sensor array networks, beamforming and power control, exploiting the individual strengths of each method according to an SINR threshold. The key issue in RL is to select the action (BF or PC) according to the current state, in this case the SINR, such that the system requirement is satisfied.…”
Section: Introduction
Citation type: mentioning (confidence: 99%)
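To make the selection idea in that statement concrete, the sketch below quantizes a measured SINR into a discrete state and picks beamforming or power control greedily from a Q table. The SINR threshold grid, the random stand-in Q table, and the function names are illustrative assumptions, not the implementation described in [26].

```python
import numpy as np

# Sketch: map a measured SINR (dB) to one of 18 discrete states, then choose
# beamforming (BF) or power control (PC) greedily from a learned Q table.
SINR_EDGES_DB = np.linspace(-5, 30, 17)   # assumed 18-state quantization grid
ACTIONS = ("beamforming", "power_control")

def sinr_to_state(sinr_db: float) -> int:
    # Index of the SINR bin the measurement falls into (0..17)
    return int(np.searchsorted(SINR_EDGES_DB, sinr_db))

def select_action(Q: np.ndarray, sinr_db: float) -> str:
    state = sinr_to_state(sinr_db)
    return ACTIONS[int(np.argmax(Q[state]))]

# Example usage with a random stand-in for a trained Q table
Q = np.random.default_rng(1).random((18, 2))
print(select_action(Q, sinr_db=3.0))
```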