“…To produce node, job, and global embeddings, we used neural networks with six fully connected layers each, containing [16, 8, 8, 16, 8, 8] neurons, respectively. Finally, two neural networks with four fully connected layers each, containing [32, 16, 8, 1] neurons, respectively, mapped the embeddings to actions, i.e., a node selected for scheduling and a number of executors to assign. Training was performed using the REINFORCE policy-gradient algorithm executed on 16 workers.…”
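
The layer widths quoted above can be made concrete with a minimal NumPy sketch. It is an illustration under stated assumptions, not the authors' implementation: the input feature dimension, the number of nodes, the ReLU activations, the weight initialization, the choice to feed the policy head only the node embedding, and the return and baseline values are all hypothetical. Only the widths [16, 8, 8, 16, 8, 8] and [32, 16, 8, 1] and the use of REINFORCE come from the quoted text.

```python
import numpy as np

EMBED_WIDTHS = [16, 8, 8, 16, 8, 8]  # six fully connected layers (from the text)
POLICY_WIDTHS = [32, 16, 8, 1]       # four fully connected layers (from the text)


def init_mlp(rng, in_dim, widths):
    """Return a list of (W, b) pairs, one per fully connected layer."""
    params = []
    for out_dim in widths:
        W = rng.normal(0.0, 1.0 / np.sqrt(in_dim), size=(in_dim, out_dim))
        params.append((W, np.zeros(out_dim)))
        in_dim = out_dim
    return params


def mlp_forward(params, x):
    """ReLU on hidden layers, linear final layer (activation is an assumption)."""
    for W, b in params[:-1]:
        x = np.maximum(x @ W + b, 0.0)
    W, b = params[-1]
    return x @ W + b


def count_params(params):
    """Total number of weights and biases in the network."""
    return sum(W.size + b.size for W, b in params)


def softmax(z):
    """Numerically stable softmax over a 1-D score vector."""
    e = np.exp(z - z.max())
    return e / e.sum()


rng = np.random.default_rng(0)
NUM_NODES, FEATURE_DIM = 10, 5  # hypothetical sizes, not from the text

# One embedding network and one of the two policy heads (node selection).
embed_net = init_mlp(rng, FEATURE_DIM, EMBED_WIDTHS)
node_head = init_mlp(rng, EMBED_WIDTHS[-1], POLICY_WIDTHS)

features = rng.normal(size=(NUM_NODES, FEATURE_DIM))
embeddings = mlp_forward(embed_net, features)       # shape (NUM_NODES, 8)
scores = mlp_forward(node_head, embeddings)[:, 0]   # one scalar score per node
probs = softmax(scores)                             # distribution over nodes
action = int(rng.choice(NUM_NODES, p=probs))        # sampled node to schedule

# REINFORCE surrogate loss for a single decision:
#   -log pi(action) * (return - baseline)
episode_return, baseline = -3.0, -2.5               # hypothetical values
loss = -np.log(probs[action]) * (episode_return - baseline)
```

The final layer of width 1 is what lets the head assign a single score to each candidate, so a softmax over those scores yields a probability distribution of whatever length the current set of candidates requires. The second policy head, which chooses the number of executors, would follow the same pattern over a different set of candidates.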