2019 25th Asia-Pacific Conference on Communications (APCC)
DOI: 10.1109/apcc47188.2019.9026527

Model-Free Unsupervised Learning for Optimization Problems with Constraints

Abstract: In many optimization problems in wireless communications, the expressions of the objective function or constraints are hard or even impossible to derive, which makes the solutions difficult to find. In this paper, we propose a model-free learning framework to solve constrained optimization problems without the supervision of the optimal solution. Neural networks are used respectively for parameterizing the function to be optimized, parameterizing the Lagrange multiplier associated with instantaneous constraints, a…

Cited by 11 publications (11 citation statements); citing publications span 2020–2023.
References 11 publications.
“…If the gradients of the OF and CFs w.r.t. x can be derived, then the policy network and multiplier network can be trained by the primal-dual stochastic gradient method, which iteratively updates the PNP and MNP along the ascent and descent directions of the Lagrangian function's gradients, respectively [14]. The distribution of the environment's status does not have to be known, because the sample-averaged gradients can replace the true gradients in the sense of the ensemble average.…”
Section: B. Learning Generic Functional Optimization Without Labels (mentioning)
confidence: 99%
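The primal-dual update described in this citation can be sketched as follows. This is a minimal, hypothetical PyTorch example, not the paper's implementation: the toy objective f, constraint g, network sizes, dimensions, and learning rates are all illustrative assumptions. It only shows the mechanic of ascending the sample-averaged Lagrangian w.r.t. the policy-network parameters (PNP) and descending it w.r.t. the multiplier-network parameters (MNP).

```python
import torch
import torch.nn as nn

# Illustrative dimensions for the environment status h and decision x.
H_DIM, X_DIM = 8, 4

# Policy network: maps h to the decision x (its weights are the PNP).
policy = nn.Sequential(nn.Linear(H_DIM, 64), nn.ReLU(), nn.Linear(64, X_DIM))
# Multiplier network: maps h to a Lagrange multiplier (the MNP);
# Softplus keeps the multiplier non-negative.
multiplier = nn.Sequential(nn.Linear(H_DIM, 64), nn.ReLU(),
                           nn.Linear(64, 1), nn.Softplus())

opt_theta = torch.optim.SGD(policy.parameters(), lr=1e-3)
opt_phi = torch.optim.SGD(multiplier.parameters(), lr=1e-3)

def f(x, h):  # stand-in objective function (OF), to be maximized
    return -(x ** 2).sum(dim=-1)

def g(x, h):  # stand-in constraint function (CF), required to satisfy g <= 0
    return x.sum(dim=-1) - 1.0

for step in range(1000):
    h = torch.randn(32, H_DIM)                 # sampled environment states

    # Primal step: ascend the sample-averaged Lagrangian w.r.t. the PNP.
    x = policy(h)
    lam = multiplier(h).squeeze(-1).detach()   # multiplier held fixed
    L = (f(x, h) - lam * g(x, h)).mean()
    opt_theta.zero_grad()
    (-L).backward()                            # ascent on L == descent on -L
    opt_theta.step()

    # Dual step: descend the Lagrangian w.r.t. the MNP, which raises the
    # multiplier wherever the constraint g(x, h) <= 0 is violated.
    x = policy(h).detach()                     # policy held fixed
    lam = multiplier(h).squeeze(-1)
    L = (f(x, h) - lam * g(x, h)).mean()
    opt_phi.zero_grad()
    L.backward()
    opt_phi.step()
```

Because both gradients are averaged over sampled states h, the expectation over the environment's distribution is never needed explicitly, which matches the ensemble-average argument in the quoted statement.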
“…2. The value networks take x and h as inputs and output the approximated values of the OF and CFs at h. The observed values of the OF and CFs can be used to supervise the training procedure [14]. Specifically, the value networks are trained by minimizing the L2-norm loss between their outputs and the observed values of the OF and CFs using SGD, as shown in Fig.…”
Section: A. Learning Deterministic Policy (mentioning)
confidence: 99%
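A minimal sketch of this supervised value-network training, under the same illustrative assumptions as the previous example (the dimensions, network shapes, and helper names are hypothetical, not taken from the paper). Each value network regresses one observed function value at (x, h), so that a differentiable surrogate of the unknown OF or CF becomes available.

```python
import torch
import torch.nn as nn

H_DIM, X_DIM = 8, 4  # illustrative dimensions, as before

# One value network per function: each takes the pair (x, h) and
# approximates the corresponding observed value.
value_f = nn.Sequential(nn.Linear(X_DIM + H_DIM, 64), nn.ReLU(), nn.Linear(64, 1))
value_g = nn.Sequential(nn.Linear(X_DIM + H_DIM, 64), nn.ReLU(), nn.Linear(64, 1))
opt = torch.optim.SGD(list(value_f.parameters()) + list(value_g.parameters()),
                      lr=1e-3)

def train_step(x, h, f_obs, g_obs):
    """One SGD step on the L2-norm loss between the networks' outputs
    and the observed values of the OF and CFs at (x, h)."""
    xh = torch.cat([x, h], dim=-1)
    loss = ((value_f(xh) - f_obs) ** 2).mean() \
         + ((value_g(xh) - g_obs) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

# Usage with synthetic observations standing in for real measurements:
x = torch.randn(32, X_DIM)
h = torch.randn(32, H_DIM)
train_step(x, h, f_obs=torch.randn(32, 1), g_obs=torch.randn(32, 1))
```

Once trained, the value networks are differentiable in x, so their gradients can stand in for the unavailable gradients of the true OF and CFs in the primal-dual updates sketched earlier.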