2016 IEEE International Conference on Systems, Man, and Cybernetics (SMC)
DOI: 10.1109/smc.2016.7844517

Modeling behavior of Computer Generated Forces with Machine Learning Techniques, the NATO Task Group approach

Cited by 10 publications (3 citation statements)
References 5 publications
“…Although there is a study of training two combatant CGFs with supervised learning [9], supervised and unsupervised learning are not widely used, as they require a large amount of data. In addition, it is difficult to judge whether a given combat activity is right or wrong using those learning methods [10]. Reinforcement learning, on the other hand, has the advantage that it does not require correct answers: agents build data by repeating episodes in the environment and improve their behavior in the right direction through rewards.…”
Section: B. CGF Automation Methodology and Reinforcement Learning
Citation type: mentioning
confidence: 99%
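The episode-based, reward-driven learning this excerpt contrasts with supervised training can be illustrated with a minimal sketch. The toy environment below (CombatToyEnv, its states, actions, and reward values) is invented for illustration and is not the CGF simulation used in the cited work; only the tabular Q-learning loop itself is standard.

```python
# Minimal sketch of reward-driven learning over repeated episodes.
# CombatToyEnv is a hypothetical stand-in for a CGF simulation.
import random
from collections import defaultdict

class CombatToyEnv:
    """Hypothetical abstraction: 5 states, 2 actions, goal = reach the last state."""
    def __init__(self):
        self.n_states, self.n_actions = 5, 2

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        # Toy dynamics: action 1 always advances, action 0 advances only sometimes.
        advance = 1 if action == 1 else random.choice([0, 1])
        self.state = min(self.state + advance, self.n_states - 1)
        done = self.state == self.n_states - 1
        reward = 1.0 if done else -0.01   # reward signal replaces labeled "correct" answers
        return self.state, reward, done

def q_learning(env, episodes=500, alpha=0.1, gamma=0.95, eps=0.1):
    q = defaultdict(lambda: [0.0] * env.n_actions)
    for _ in range(episodes):                       # the agent builds data by repeating episodes
        s, done = env.reset(), False
        while not done:
            # Epsilon-greedy action selection: explore occasionally, otherwise exploit.
            if random.random() < eps:
                a = random.randrange(env.n_actions)
            else:
                a = max(range(env.n_actions), key=lambda i: q[s][i])
            s2, r, done = env.step(a)
            # Reward feedback nudges behavior in the right direction, with no labels needed.
            q[s][a] += alpha * (r + gamma * max(q[s2]) - q[s][a])
            s = s2
    return q

if __name__ == "__main__":
    print(q_learning(CombatToyEnv()))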
“…Generally speaking, the main underlying objective is learning, via trial and error, from the previous interactions of an autonomous agent with its surrounding environment. The optimal control (action) policy can be obtained by RL algorithms through the feedback that the environment provides to the agent after each of its actions [2,3,4,5,6,7,8,9]. Policy optimality can be reached with such an approach, the goal being to increase the reward over time.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
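As a rough illustration of the trial-and-error feedback loop this passage describes, the sketch below uses a toy 3-armed bandit: the agent observes only the reward returned after each action, and its average reward rises over time. The payoff probabilities and all names are assumptions made for the example, not anything taken from the cited paper.

```python
# Sketch of reward feedback after each action driving the policy toward optimality.
import random

ARM_PROBS = [0.2, 0.5, 0.8]          # hidden payoff probabilities (assumed for the example)

def pull(arm):
    """Environment feedback: reward 1 with the arm's hidden probability, else 0."""
    return 1.0 if random.random() < ARM_PROBS[arm] else 0.0

def run(steps=5000, eps=0.1):
    estimates = [0.0] * len(ARM_PROBS)   # value estimates learned purely from feedback
    counts = [0] * len(ARM_PROBS)
    total = 0.0
    for t in range(1, steps + 1):
        # Epsilon-greedy: mostly exploit the current best estimate, sometimes explore.
        if random.random() < eps:
            arm = random.randrange(len(ARM_PROBS))
        else:
            arm = max(range(len(ARM_PROBS)), key=estimates.__getitem__)
        r = pull(arm)
        counts[arm] += 1
        estimates[arm] += (r - estimates[arm]) / counts[arm]   # incremental mean update
        total += r
        if t % 1000 == 0:
            print(f"step {t}: average reward so far = {total / t:.3f}")
    return estimates

if __name__ == "__main__":
    run()
```

Running the loop shows the average reward climbing toward the best arm's payoff rate, which is the "increasing the reward over time" behavior the excerpt refers to.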