Genomic data is growing rapidly as the genomes of more and more forms of life are sequenced. To make sense of this volume of data and obtain meaningful information, data mining techniques such as Principal Component Analysis and Discriminant Analysis are used. Data mining is suited to cases where the data is vast and hidden knowledge must be extracted in the form of useful patterns. The data set considered here is protein data pertaining to diabetes mellitus, obtained from a database. The task was to find out in which species most of the diabetes-related proteins occur. Most of these proteins were found in human beings, the house mouse and the Norway rat, as all three are mammals and human proteins have orthologs in the house mouse and the Norway rat. Both techniques show that the human proteins differ in the variation of their protein attributes from those of the house mouse and the Norway rat, which are similar to each other. The same can be inferred from statistical analysis using histograms and bivariate plots. Other data mining techniques such as regression and clustering can be used to explore this inference further.
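As a rough illustration of the kind of analysis described above, the following sketch projects protein attributes onto their first principal component and compares the mean score per species. It is not the authors' method or data: the two attributes, the group centres, and the sample sizes are synthetic assumptions, and the component is found by plain power iteration on the covariance matrix rather than by any particular library.

```python
import math
import random

rng = random.Random(0)  # fixed seed so the synthetic data is reproducible

# Synthetic stand-in data (an assumption, not the paper's data set):
# 30 proteins per species, each with two made-up numeric attributes.
# Centres are chosen so that mouse and rat overlap and human is shifted.
centres = {"human": (8.0, 8.0), "mouse": (2.0, 2.0), "rat": (2.2, 2.1)}
labels, rows = [], []
for species, (cx, cy) in centres.items():
    for _ in range(30):
        labels.append(species)
        rows.append([rng.gauss(cx, 0.5), rng.gauss(cy, 0.5)])


def mean_center(data):
    """Subtract the column mean from every value."""
    n, dims = len(data), len(data[0])
    means = [sum(r[j] for r in data) / n for j in range(dims)]
    return [[r[j] - means[j] for j in range(dims)] for r in data]


def covariance(centered):
    """Sample covariance matrix of mean-centred rows."""
    n, dims = len(centered), len(centered[0])
    return [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(dims)]
        for i in range(dims)
    ]


def first_component(cov, iterations=200):
    """Leading eigenvector of cov, approximated by power iteration."""
    dims = len(cov)
    v = [1.0 / math.sqrt(dims)] * dims
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(dims)) for i in range(dims)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


centered = mean_center(rows)
pc1 = first_component(covariance(centered))

# Score of each protein along the first principal component.
scores = [sum(a * b for a, b in zip(row, pc1)) for row in centered]

# Mean score per species; nearby means suggest similar attribute variation.
group_means = {
    s: sum(sc for sc, lab in zip(scores, labels) if lab == s) / labels.count(s)
    for s in centres
}
for species, value in group_means.items():
    print(f"{species}: mean PC1 score = {value:.2f}")
```

With real data, each row would hold the measured attributes of one protein, and attributes on different scales would normally be standardised before the covariance step.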