Feature selection reduces feature dimensionality and computational complexity by eliminating irrelevant and redundant features. The rapid growth of data and its processing generates large feature sets, which feature selection reduces to improve the performance of classification, regression, and clustering models. This study analyzes the motivation for feature selection in detail and concentrates on its fundamental architecture. It aims to establish a structured organization of popular methods, such as filter, wrapper, and embedded methods, in terms of search strategies, evaluation criteria, and learning methods. The benefits and drawbacks of the different methods are compared, followed by a review of multiple classification algorithms and standard validation measures. The diversity of applications across domains such as data retrieval, prediction analysis, and medical, intrusion detection, and industrial applications is highlighted. The study also covers some additional feature selection methods for handling big data. Nonetheless, new challenges have surfaced in the analysis of such data, and these are also addressed. Reflecting on commonly encountered challenges and clarifying how to identify the most suitable feature selection method are significant components of this study.