“…In this application, we consider sigmoid and tanh activation functions for the hidden layer. Additional hyperparameters we tuned included the number of hidden units for the µ network (10, 15, 20), the number of hidden units for the y₀ network (2, 5, 10), the learning rate (0.002, 0.005), and the delta for early stopping (0.01, 0.02). Many other hyperparameters, such as the batch size, could also be tuned.…”
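A search over the quoted values amounts to an exhaustive grid of 2 × 3 × 3 × 2 × 2 = 72 configurations. The following is a minimal sketch of enumerating that grid; the dictionary keys (e.g., `mu_units`, `y0_units`) and the `train_and_evaluate` call are illustrative labels assumed here, not identifiers from the paper.

```python
from itertools import product

# Hyperparameter grid taken from the text; key names are illustrative.
grid = {
    "activation": ["sigmoid", "tanh"],     # hidden-layer activation
    "mu_units": [10, 15, 20],              # hidden units for the mu network
    "y0_units": [2, 5, 10],                # hidden units for the y0 network
    "learning_rate": [0.002, 0.005],
    "early_stopping_delta": [0.01, 0.02],  # minimum improvement before stopping
}

# Enumerate the full Cartesian product: 2 * 3 * 3 * 2 * 2 = 72 configurations.
keys = list(grid)
for values in product(*(grid[k] for k in keys)):
    config = dict(zip(keys, values))
    # A real run would call something like train_and_evaluate(config) here
    # (hypothetical function); we simply print each candidate configuration.
    print(config)
```

Adding further axes such as batch size multiplies the grid size accordingly, which is why exhaustive search is usually limited to a small set of values per hyperparameter.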