2020 IEEE 28th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM)
DOI: 10.1109/fccm48280.2020.00012
Accelerating Proximal Policy Optimization on CPU-FPGA Heterogeneous Platforms

Cited by 29 publications (16 citation statements)
References 16 publications
“…Additionally, the compute unit contains buffers for the outputs and updates of the neural network layers. The architecture of [47] achieves higher efficiency than the TRPO architecture according to this measure. For the accelerators implementing full DRL training, the column IPS/LUT provides a point of comparison.…”
Section: Comparison of Policy Gradient Implementations
confidence: 82%
See 1 more Smart Citation
“…Additionally, the compute unit contains buffers for the output and updates of neural network layers. [47] higher efficiency than the TRPO architecture according to this measure. For the accelerators implementing full DRL training, the column IPS/LUT provides a point of comparison.…”
Section: ) Comparison Of Policy Gradient Implementationsmentioning
confidence: 82%
“…Another heterogeneous architecture was implemented by Meng et al. [47] for the PPO algorithm. It is composed of a host CPU, which performs the loss and advantage computations, and an FPGA, which performs the forward propagation, backward propagation, and weight update.…”
Section: Implementations of Policy Gradient Algorithms
confidence: 99%
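The CPU-side computations named in the excerpt above (loss and advantage) can be illustrated with a minimal NumPy sketch of Generalized Advantage Estimation and the standard PPO clipped surrogate loss. This is an illustrative sketch of the generic PPO quantities, not the implementation of Meng et al. [47]; the function names and defaults here are hypothetical.

```python
import numpy as np

def gae_advantages(rewards, values, gamma=0.99, lam=0.95):
    # Generalized Advantage Estimation, computed backward over a finished
    # trajectory (the terminal state's value is taken as 0 here).
    T = len(rewards)
    adv = np.zeros(T)
    last = 0.0
    for t in reversed(range(T)):
        next_v = values[t + 1] if t + 1 < T else 0.0
        delta = rewards[t] + gamma * next_v - values[t]   # TD residual
        last = delta + gamma * lam * last                 # discounted sum of residuals
        adv[t] = last
    return adv

def ppo_clip_loss(ratio, adv, eps=0.2):
    # PPO clipped surrogate objective, negated so a minimizer can be used.
    # `ratio` is pi_new(a|s) / pi_old(a|s) for each sampled action.
    clipped = np.clip(ratio, 1.0 - eps, 1.0 + eps) * adv
    return -np.mean(np.minimum(ratio * adv, clipped))
```

In the heterogeneous split described by the excerpt, these scalar/vector computations stay on the host, while the FPGA handles the dense forward/backward passes and weight updates of the policy and value networks.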
“…Wang et al [2019] and Zhou and Prasanna [2017] have shown that some graph algorithms are similarly well-suited to these platforms. Winterstein and Constantinides [2017] have demonstrated similar results about K-means clustering applications using a different CPU/FPGA system called the Intel Cyclone V. More recently, some machine learning applications have improved their throughput when ported from a CPU/GPU implementation to a CPU/FPGA implementation [Guo et al 2019[Guo et al , 2018Meng et al 2020].…”
Section: Further Related Workmentioning
confidence: 84%
“…Recently, instead of merely running GCNs on GPUs (CPUs), various experimental platforms have been used to accelerate GCN training and inference, for instance parallel platforms [56], (multi-)FPGA platforms [33, 71, 72], and heterogeneous platforms [73, 76]. On the other hand, the computation and storage costs of sampling methods grow rapidly as graph sizes explode, putting pressure on existing experimental platforms.…”
Section: Challenges and Future Directions
confidence: 99%