In-network learning (INL), a form of distributed machine learning (ML), enables both learning and inference over a network by deploying small models, each representing part of a global neural network (NN) model, on the network nodes. The inference and learning speed of an INL model depends on its computation and communication costs. The small NN models must therefore be placed appropriately on the physical network to meet the service requirements. In other words, INL involves an NN placement problem that aims to minimize the maximum end-to-end delay. In this paper, we formulate this NN placement problem as a mixed integer linear program (MILP) and develop a heuristic algorithm to overcome its computational complexity. Numerical experiments show that the MILP can be solved for physical networks of up to 400 nodes. For networks with more than 400 nodes, however, the MILP does not return an optimal solution because the solver reaches the time limit or runs out of memory. Compared with the MILP, the proposed heuristic algorithm obtains a feasible solution in less computation time as the physical network size grows, at the expense of solution optimality.
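To make the min-max delay objective concrete, the following is a minimal sketch of a much simpler toy variant, not the MILP or the heuristic proposed in this paper. It assumes that the input sources are fixed at given physical nodes, that only a single fusion model has to be placed, and that the end-to-end delay of a source is the smallest sum of link delays to the chosen node plus that node's computation delay. The function names, the graph layout, and the delay values are hypothetical and chosen only for illustration.

```python
import heapq


def shortest_delays(graph, src):
    """Dijkstra: smallest sum of link delays from src to each reachable node."""
    dist = {src: 0.0}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def place_fusion_model(graph, sources, compute_delay):
    """Pick the node that minimizes the maximum end-to-end delay.

    Toy assumption: end-to-end delay of a source = communication delay to
    the candidate node + computation delay at that node.
    """
    dist_from = {s: shortest_delays(graph, s) for s in sources}
    best_node, best_delay = None, float("inf")
    for node in graph:
        # Skip candidates that some source cannot reach.
        if any(node not in dist_from[s] for s in sources):
            continue
        worst = max(dist_from[s][node] for s in sources) + compute_delay[node]
        if worst < best_delay:
            best_node, best_delay = node, worst
    return best_node, best_delay


# Hypothetical 4-node physical network: link delays (symmetric) and
# per-node computation delays, in arbitrary time units.
example_graph = {
    "a": {"b": 2.0, "c": 5.0},
    "b": {"a": 2.0, "c": 1.0, "d": 4.0},
    "c": {"a": 5.0, "b": 1.0, "d": 1.0},
    "d": {"b": 4.0, "c": 1.0},
}
example_compute = {"a": 1.0, "b": 1.0, "c": 3.0, "d": 1.0}
example_choice = place_fusion_model(example_graph, ["a", "d"], example_compute)
```

Exhaustive search over candidate nodes is tractable here only because a single model is placed. Once several interdependent models must be placed jointly, the number of candidate placements grows combinatorially, which is the situation that motivates an MILP formulation and a heuristic.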