One of the most promising solutions to overcome the capacity limit of current optical fiber links is space-division multiplexing, which transmits signals over the multiple cores of multi-core fibers or the multiple modes of few-mode fibers. To realize such systems, suitable optical fiber amplifiers must be designed. In single-mode fibers, Raman amplification has shown significant advantages over doped fiber amplifiers owing to its low noise and spectral flexibility. For these reasons, its use in next-generation space-division multiplexing transmission systems is being studied extensively. In this work, we propose a deep learning method that uses automatic differentiation to embed a complete few-mode Raman amplification model in the training process of a neural network. The network identifies the pump wavelengths and power allocation scheme that best produce a target gain profile, whether flat or tilted. Unlike other machine learning methods, the proposed technique allows the neural network to be trained on ideal gain profiles, removing the need to compute a dataset that accurately covers the space of Raman gains of interest. The ability to directly target a selected region of the space of possible gains allows the method to be easily generalized to any type of Raman gain profile, while also being more robust as the number of pumps, the number of modes, and the amplification bandwidth increase. The approach is tested on a 70 km long 4-mode fiber transmitting over the C+L band with various numbers of Raman pumps in the counter-propagating scheme, targeting gain profiles with an average gain in the interval from 5 dB to 15 dB and a total tilt in the interval from −1.425 dB to 1.425 dB. We achieve wavelength- and mode-dependent gain fluctuations lower than 0.04 dB and 0.02 dB per dB of gain, respectively.
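To make the notion of training on ideal gain profiles concrete, the following minimal Python sketch generates target profiles from the average-gain and total-tilt ranges quoted above. It is an illustration under stated assumptions, not the implementation used in this work: the tilt is assumed to be linear in wavelength, a positive tilt is taken to mean gain increasing with wavelength, and the band edges (1530 nm to 1625 nm), the channel count, and the names `ideal_gain_profile` and `sample_targets` are hypothetical choices made only for this example.

```python
import random

# Hypothetical C+L band edges in nm; the text does not state exact values.
BAND_START_NM = 1530.0
BAND_END_NM = 1625.0


def ideal_gain_profile(avg_gain_db, total_tilt_db, num_channels=50):
    """Return (wavelengths_nm, gains_db) for a linear target gain profile.

    Assumption: the gain varies linearly with wavelength, changing by
    total_tilt_db between the first and last channel, and is centred on
    avg_gain_db. A zero tilt gives a flat profile.
    """
    if num_channels < 2:
        raise ValueError("num_channels must be at least 2")
    step = (BAND_END_NM - BAND_START_NM) / (num_channels - 1)
    wavelengths = [BAND_START_NM + i * step for i in range(num_channels)]
    # The offset (i / (n - 1) - 0.5) runs from -0.5 to +0.5 and sums to zero,
    # so the mean of the profile equals avg_gain_db.
    gains = [
        avg_gain_db + total_tilt_db * (i / (num_channels - 1) - 0.5)
        for i in range(num_channels)
    ]
    return wavelengths, gains


def sample_targets(num_profiles, seed=0):
    """Draw random target profiles from the ranges quoted in the text."""
    rng = random.Random(seed)
    targets = []
    for _ in range(num_profiles):
        avg_gain_db = rng.uniform(5.0, 15.0)        # average gain, dB
        total_tilt_db = rng.uniform(-1.425, 1.425)  # total tilt, dB
        targets.append(ideal_gain_profile(avg_gain_db, total_tilt_db))
    return targets
```

For instance, `ideal_gain_profile(10.0, 1.0, num_channels=5)` returns gains rising linearly from 9.5 dB to 10.5 dB. Because such targets are cheap to generate, they can be drawn directly from the region of interest rather than obtained by simulating the amplifier over many pump configurations.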