Stress is a regular part of life for individuals of all ages, but excessive and chronic stress can lead to physical and mental health problems, decreased productivity, and reduced quality of life. By identifying stress at an early stage, individuals can take steps to manage it effectively and improve their overall wellbeing. Feature selection is a critical aspect of early stress detection because it identifies the most relevant and informative features for differentiating between stressed and non-stressed individuals. This paper first proposes a variance-based feature selection technique that uses a Q-learning embedded Starling Murmuration Optimiser (QLESMO) to choose relevant features from a publicly available dataset in which the stress experienced by nurses working during the COVID-19 pandemic is recorded using bio-signals and user surveys. A comparative study with other metaheuristic-based feature selection techniques is also presented. Next, the efficacy of the proposed algorithm is evaluated on 10 benchmark test functions. The reduced feature subset is then classified by a 1D convolutional neural network (CNN) model (QLESMO-CNN), which performs well on the evaluation metrics in comparison with other competitive algorithms. Finally, the proposed technique is compared with state-of-the-art methodologies in the literature. The experiments provide a strong basis for determining the features that are most relevant for early mental stress classification using a hybrid model that combines a CNN, reinforcement learning, and metaheuristic algorithms.