With advances in computational techniques, the amount of genomic data has grown exponentially [1], which makes it hard to use such data in the medical field without appropriate pre-processing. This growth also increases complexity and raises veracity issues [2], which in turn create complications in storage, analysis, privacy and security. Genomic data may therefore look easy to handle in terms of volume, but it actually requires a complicated process because of the complexity, heterogeneity and hybridity of its features. This process is known as the knowledge discovery process [3] and comprises the following steps:
• Data recording: covers the challenges and tools involved in capturing and storing the data.
• Data pre-processing: covers all the operations that clean the captured data and convert it into an analysis-ready form, in order to optimize the analysis step.
• Data analysis: the task of evaluating the data with different algorithms, following logical reasoning to examine each component of the data provided, with the aim of producing insightful outcomes.
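As an illustration only, the three steps of the knowledge discovery process can be sketched as a small Python pipeline. The function names (`record`, `preprocess`, `analyze`, `run_pipeline`), the tab-separated input format, and the choice of GC content as the analysis are assumptions made for this sketch. They are not taken from [3] and stand in for whatever tools a real study would use.

```python
from collections import Counter


def record(raw_lines):
    """Data recording: capture raw reads as (read_id, sequence) pairs.

    Assumes a hypothetical tab-separated format: "<read_id>\t<sequence>".
    """
    records = []
    for line in raw_lines:
        read_id, _, seq = line.partition("\t")
        records.append((read_id.strip(), seq.strip()))
    return records


def preprocess(records, alphabet="ACGT"):
    """Data pre-processing: normalise case and drop empty or invalid reads."""
    cleaned = []
    for read_id, seq in records:
        seq = seq.upper()
        # Keep only non-empty reads made entirely of the allowed bases.
        if seq and all(base in alphabet for base in seq):
            cleaned.append((read_id, seq))
    return cleaned


def analyze(records):
    """Data analysis: compute the GC content (fraction of G and C) per read."""
    result = {}
    for read_id, seq in records:
        counts = Counter(seq)
        result[read_id] = (counts["G"] + counts["C"]) / len(seq)
    return result


def run_pipeline(raw_lines):
    """Chain the three steps: recording, pre-processing, analysis."""
    return analyze(preprocess(record(raw_lines)))


if __name__ == "__main__":
    raw = ["r1\tacgt", "r2\tGGCC", "r3\tACNX", "r4\t"]
    print(run_pipeline(raw))
```

In this toy input, the reads `r3` (ambiguous bases) and `r4` (empty sequence) are discarded during pre-processing, so only `r1` and `r2` reach the analysis step. Keeping each step as a separate function mirrors the separation of concerns in the list above: the cleaning rules can change without touching how data is captured or analyzed.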