Citation: Ahmed OW, Qahwaji RSR, Colak T, Higgins PAB, Gallagher P and Bloomfield S (2013) Solar flare prediction using advanced feature extraction, machine learning and feature selection. Solar Physics. 283(1): 157-175.
Abstract: Novel machine-learning and feature-selection algorithms have been developed to study: (i) the flare prediction capability of magnetic feature (MF) properties generated by the recently developed Solar Monitor Active Region Tracker (SMART); and (ii) the SMART MF properties that are most significantly related to flare occurrence. Spatio-temporal association algorithms are developed to associate MFs with flares from April 1996 to December 2010, in order to differentiate flaring from non-flaring MFs and to enable the application of machine-learning and feature-selection algorithms. A machine-learning algorithm is applied to the associated datasets to determine the flare prediction capability of all 21 SMART MF properties. The prediction performance is assessed using standard forecast verification measures and compared with that of Automated Solar Activity Prediction (ASAP), one of the industry's standard flare prediction technologies, which is also based on machine learning. The comparison shows that the combination of SMART MFs with machine learning has the potential to achieve more accurate flare prediction than ASAP. Feature-selection algorithms are then applied to determine the MF properties that are most related to flare occurrence. It is found that a reduced set of 6 MF properties can achieve a degree of prediction accuracy similar to that of the full set of 21 SMART MF properties.