2017 IEEE International Conference on Systems, Man, and Cybernetics (SMC)
DOI: 10.1109/smc.2017.8123163

Machine learning techniques for autonomous agents in military simulations — Multum in parvo

Cited by 18 publications (9 citation statements). References 15 publications. Citing publications span 2019–2024.

“…In this toy example (discussed for clarification purposes), considering 9 RBFs together with localized observation vectors of size 12 for the predator and 10 for the preys, the mean vectors associated with the predator and the preys are of dimensions 9 × 12 and 9 × 10, respectively. Consequently, for this Predator-Prey scenario, µ, which is initialized randomly, contains the means for three agents with sizes ((9, 12), (9, 10), (9, 10)), and the covariance is Σ = (I₁₂, I₁₀, I₁₀), where I₁₂ and I₁₀ are the identity…”
Section: Experimental Assumptions (mentioning, confidence: 99%)
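
As a minimal sketch (not the cited paper's code), the following NumPy snippet reproduces the shapes stated in the quote: one randomly initialized mean per agent, built from 9 RBFs and observation sizes 12 (predator) and 10 (each prey), with identity covariances. All variable names here are hypothetical.

import numpy as np

n_rbfs = 9
obs_sizes = [12, 10, 10]  # predator, prey 1, prey 2 (from the quote)

rng = np.random.default_rng(0)
# Mean vectors: one randomly initialized (9 x obs_size) matrix per agent.
mu = [rng.standard_normal((n_rbfs, d)) for d in obs_sizes]
# Covariances: identity matrices Sigma = (I_12, I_10, I_10).
sigma = [np.eye(d) for d in obs_sizes]

for m, s in zip(mu, sigma):
    print(m.shape, s.shape)  # (9, 12) (12, 12), then (9, 10) (10, 10) twice
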
“…Generally speaking, the main underlying objective is learning (via trial and error) from previous interactions of an autonomous agent and its surrounding environment. The optimal control (action) policy can be obtained via RL algorithms through the feedback that the environment provides to the agent after each of its actions [3–6, 8, 9]. Policy optimality can be reached via such an approach with the goal of increasing the reward over time…”
Section: Introduction (mentioning, confidence: 99%)
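
To make the trial-and-error loop described above concrete, here is a minimal tabular Q-learning sketch on a toy one-dimensional chain environment; the environment, reward, and all hyperparameters are illustrative assumptions, not taken from the cited papers. The agent acts, receives the environment's feedback as a reward, and updates its action-value estimates so its policy accumulates more reward over time.

import numpy as np

n_states, n_actions = 5, 2
Q = np.zeros((n_states, n_actions))
alpha, gamma, epsilon = 0.1, 0.9, 0.1  # learning rate, discount, exploration
rng = np.random.default_rng(0)

def step(state, action):
    # Toy chain: action 1 moves right, action 0 moves left;
    # only reaching the last state yields a reward.
    next_state = min(state + 1, n_states - 1) if action == 1 else max(state - 1, 0)
    reward = 1.0 if next_state == n_states - 1 else 0.0
    return next_state, reward

for episode in range(200):
    state = 0
    for _ in range(20):
        # Epsilon-greedy: mostly exploit current estimates, sometimes explore.
        if rng.random() < epsilon:
            action = int(rng.integers(n_actions))
        else:
            action = int(Q[state].argmax())
        next_state, reward = step(state, action)
        # Q-learning update driven by the environment's feedback.
        Q[state, action] += alpha * (reward + gamma * Q[next_state].max() - Q[state, action])
        state = next_state

print(Q.round(2))  # values along the chain should favor action 1 (move right)
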
“…However, virtual humans are not new. They have been used in many domains, including advertisements as well as medical practice [1,2], healthcare [3,4], education [5,6], entertainment [7,8], and the military [9,10], interacting with the user and acting on the environment to provide a positive influence, such as behavioral change of the human counterpart [11]. The emphasis on interactivity with the user brought the term virtual agent, which utilizes verbal (e.g., conversation) and nonverbal communication (e.g., facial expressions and behavioral gestures) channels to learn, adapt, and assist the human.…”
Section: Introduction (mentioning, confidence: 99%)