Objective: Magnetoencephalography (MEG) based Brain-Computer Interface (BCI) involves a large number of sensors, allowing better spatiotemporal resolution for assessing brain activity patterns. There have been many efforts to develop MEG-based BCI with high accuracy, though an increase in the number of channels means an increase in computational complexity. However, not all sensors necessarily contribute significantly to classification accuracy, and for MEG-based BCI specifically no channel selection methodology has been applied. Therefore, this study investigates the effect of channel selection on the performance of MEG-based BCI. Approach: MEG data were recorded for two sessions from 15 healthy participants performing motor imagery, cognitive imagery and a mixed imagery task pair using a unique paradigm. Four state-of-the-art channel selection methods, i.e. Class-Correlation (CC), ReliefF (RF), Random Forest (RandF), and Infinite Latent Feature Selection (ILFS), were applied across six binary tasks in three different frequency bands, and their performance was evaluated on two state-of-the-art features, i.e. band power and common spatial patterns (CSP). Main results: All four methods provided a statistically significant increase in classification accuracy (CA) compared to a baseline method using all gradiometer sensors, i.e. 204 channels with band-power features from the alpha (8-12 Hz), beta (13-30 Hz), or broadband (α+β, 8-30 Hz) frequency band. The alpha band performed better than the beta and broadband frequency bands, and the beta band gave the lowest CA of the three. Channel selection improved accuracy irrespective of feature type. Moreover, all the methods reduced the number of channels significantly, from 204 to between 1 and 25 with band power as the feature and to between 15 and 105 with CSP. The optimal number of channels varied not only across sessions but also across participants. Reducing the number of channels will help to decrease the computational cost and maintain numerical stability when the number of trials is low. Significance: The study showed that channel selection significantly improves the performance of MEG-based BCI irrespective of feature type, and it can therefore be applied successfully in BCI applications.
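To make the idea of correlation-based channel ranking concrete, the following is a minimal sketch and not the authors' implementation. The abstract names the Class-Correlation method without describing it, so the sketch assumes a simple variant: each channel is scored by the absolute Pearson correlation between its log band-power and the class label, and the top-scoring channels are kept. The data are synthetic, the log-variance of each channel stands in for band power (which holds only if the signal has already been band-pass filtered), and all function names (`log_bandpower`, `rank_channels`, `make_synthetic_trials`) are hypothetical.

```python
import math
import random


def log_bandpower(trial):
    """Log-variance per channel; a band-power proxy for band-pass-filtered data."""
    feats = []
    for ch in trial:
        mean = sum(ch) / len(ch)
        var = sum((x - mean) ** 2 for x in ch) / len(ch)
        feats.append(math.log(var))
    return feats


def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0


def rank_channels(features, labels):
    """Rank channels by |correlation| between their feature and the label, best first."""
    n_channels = len(features[0])
    scores = [
        abs(pearson([f[c] for f in features], labels)) for c in range(n_channels)
    ]
    order = sorted(range(n_channels), key=lambda c: scores[c], reverse=True)
    return order, scores


def make_synthetic_trials(n_trials=40, n_channels=8, n_samples=200,
                          informative=(2, 5), seed=0):
    """Synthetic two-class data: only the 'informative' channels change power with class."""
    rng = random.Random(seed)
    trials, labels = [], []
    for i in range(n_trials):
        label = i % 2
        trial = []
        for c in range(n_channels):
            # Assumed effect: class 1 has larger amplitude on informative channels.
            scale = 3.0 if (label == 1 and c in informative) else 1.0
            trial.append([rng.gauss(0.0, scale) for _ in range(n_samples)])
        trials.append(trial)
        labels.append(label)
    return trials, labels


trials, labels = make_synthetic_trials()
features = [log_bandpower(t) for t in trials]
order, scores = rank_channels(features, labels)
selected = sorted(order[:2])  # keep the two best-ranked channels
print("selected channels:", selected)
```

In this toy setting the two channels whose power depends on the class should rank first. With real recordings, the number of channels to keep would have to be chosen per participant and per session (for example by cross-validation), consistent with the variability in optimal channel number reported above.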