This study optimizes the performance of an uplink pairwise Non-Orthogonal Multiple Access (NOMA) scenario, with and without the support of a relayer, under jamming attacks. We consider two relaying protocols: one where the sources and the destination are within range of each other and one where they are not. The relay node can be mobile, e.g., a mobile base station or an unmanned aerial vehicle (UAV), or a stationary node chosen through a relay selection procedure. We also benchmark against a NOMA retransmission protocol and an Orthogonal Multiple Access (OMA) scheme without a relayer. We analyze, adjust and compare the four protocols for different settings using outage analysis, an efficient tool for establishing communication reliability for both individual nodes and the overall wireless network. Closed-form expressions of outage probabilities can be adopted by deep reinforcement learning (RL) algorithms to optimize wireless networks online. Accordingly, we first derive closed-form expressions for the individual outage probability (IOP) of each source node link and the relayer link using both pairwise NOMA and OMA. Next, we analyze the IOP for one packet (IOPP) for each source node, considering all possible links from the source node to the destination and taking both phases of the considered protocols into account, when operating in Nakagami-m fading channels. The overall outage probability for all packets (OOPP) is defined as the maximum IOPP among the source nodes. This metric is useful for optimizing the whole wireless network, e.g., to ensure fairness among the source nodes. Then, we propose a deep RL method in which the OOPP is used as a reward function in order to adapt to the dynamic environment associated with jamming attacks. Finally, we provide guidelines for enhancing the communication reliability of the legitimate system.
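As a rough illustration of how the OOPP metric is formed, the sketch below takes the maximum of per-source outage probabilities. It is not the closed-form IOPP derived in this work: as a stand-in, it assumes a single interference-free and jamming-free link per source under Nakagami-m fading with integer m, for which the received SNR is Gamma-distributed. The function names `nakagami_outage` and `oopp` and all numeric values are hypothetical.

```python
import math


def nakagami_outage(m: int, avg_snr: float, threshold: float) -> float:
    """Outage probability P(SNR < threshold) for one link.

    Simplifying assumption (not the paper's model): no NOMA interference and
    no jammer, integer Nakagami shape m, so the SNR follows a Gamma
    distribution with shape m and mean avg_snr.
    """
    if m < 1 or avg_snr <= 0 or threshold < 0:
        raise ValueError("need integer m >= 1, avg_snr > 0, threshold >= 0")
    x = m * threshold / avg_snr
    # Regularized lower incomplete gamma function for integer shape m.
    partial_sum = sum(x**k / math.factorial(k) for k in range(m))
    return 1.0 - math.exp(-x) * partial_sum


def oopp(iopp_per_source):
    """OOPP as defined in the text: the maximum IOPP among the source nodes."""
    return max(iopp_per_source)


# Hypothetical two-source example in linear (not dB) SNR units.
iopps = [
    nakagami_outage(m=2, avg_snr=10.0, threshold=1.0),  # stronger source
    nakagami_outage(m=2, avg_snr=5.0, threshold=1.0),   # weaker source
]
worst_case = oopp(iopps)
```

Because the OOPP tracks the worst-off source, minimizing it (or using it to build a reward, with the sign convention left as a design choice) pushes toward fairness among the source nodes.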