Despite the various merits of the conventional gas tungsten arc welding (C-GTAW) process, shallow penetration is regarded as its most important drawback. Recently, to cope with this low penetration, applying a paste-like coating of activating flux during welding, a process known as activated GTAW (A-GTAW), has been proposed. In this paper, the effects of the A-GTAW input parameters (welding speed (S), welding current (C), and the percentage combination of the activating fluxes TiO2 and SiO2 (F)) on the most important quality characteristics (weld bead width (WBW), depth of penetration (DOP), and consequently aspect ratio (ASR)) of AISI316L parts have been considered. To gather the required data and to meet the modeling and optimization objectives, a Box-Behnken design (BBD) of experiments, a back-propagation neural network (BPNN), and the simulated annealing (SA) and particle swarm optimization (PSO) algorithms have been employed. The PSO algorithm has been used to determine the proper artificial neural network (ANN) architecture (the number of hidden layers and their corresponding neurons/nodes) and to optimize the resulting ANN model to obtain the desired aspect ratio, maximum depth of penetration, and minimum weld bead width. Next, the SA algorithm has been used to avoid becoming trapped in local minima. Finally, confirmation experiments have been carried out to evaluate the performance of the proposed method. According to the obtained results, the suggested method for modeling and optimization of the A-GTAW process is quite efficient (with less than 4% error).
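For readers unfamiliar with the Box-Behnken design mentioned above, the following is a minimal sketch of how such a design can be generated for the three input factors (S, C, F) in coded units (-1, 0, +1). It is illustrative only: the function name `box_behnken_3` and the choice of three center-point replicates are assumptions, not details taken from the paper, and the actual factor levels used in the experiments are not reproduced here.

```python
from itertools import combinations, product


def box_behnken_3(n_center=3):
    """Return a three-factor Box-Behnken design in coded units.

    Each pair of factors is varied over (-1, +1) while the remaining
    factor is held at its mid level (0). n_center replicates of the
    center point are appended (n_center=3 is an assumed value).
    """
    n_factors = 3
    runs = []
    for i, j in combinations(range(n_factors), 2):
        for a, b in product((-1, 1), repeat=2):
            run = [0] * n_factors
            run[i], run[j] = a, b
            runs.append(tuple(run))
    # Center points: every factor at its mid level.
    runs.extend([(0, 0, 0)] * n_center)
    return runs


# Columns correspond to welding speed (S), welding current (C),
# and flux combination percentage (F), all in coded units.
design = box_behnken_3()
for s, c, f in design:
    print(f"S={s:+d}  C={c:+d}  F={f:+d}")
```

Under these assumptions the design has 12 edge-midpoint runs plus the center replicates; the coded levels would then be mapped to the physical ranges of S, C, and F before running the experiments.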