Introduction
Currently, the phrase Big Data has become one of the most fashionable terms in the IT field. It refers to a broad class of datasets that are difficult to manage with traditional tools [1]. Big Data is used in economics and commerce, finance, electronic shopping, healthcare, astrophysics, oceanology, manufacturing, and numerous other areas. These datasets are highly complex, and their size grows exponentially day by day. As data increases in volume, in variety, and in velocity, the complexity of handling it also increases. Big Data is a developing area with many open research problems and challenges to address. The major challenges in Big Data are: i) managing data volume, ii) analysis of Big Data, iii) privacy of information, iv) storage of massive quantities of information, v) information visualization, vi) job scheduling in Big Data, vii) fault tolerance.
Research Methodology
Managing data volume
The huge quantity of data arriving from various fields of study, such as genomics, astrophysics, and weather forecasting, makes it extremely hard for domain scientists to handle [1,2].
Analysis of Big Data
It is hard to analyze Big Data due to the heterogeneity and incompleteness of the information. Collected data can vary in form, variety, and structure [3].
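The heterogeneity problem above can be made concrete with a small sketch. Assuming, purely for illustration, that records arrive as a mix of JSON strings and CSV rows (these formats and field names are hypothetical, not from the paper), one common preprocessing step is to normalize everything into a single schema before analysis; incomplete records keep a placeholder instead of breaking the pipeline:

```python
import csv
import io
import json

# Hypothetical example: incoming records are either JSON objects or CSV
# rows; normalize both into a common {"id", "value"} schema.
def normalize(record: str) -> dict:
    if record.strip().startswith("{"):
        # JSON-style record: pull fields by name, missing ones become None
        obj = json.loads(record)
        return {"id": obj.get("id"), "value": obj.get("value")}
    # CSV-style record: pull fields by position
    row = next(csv.reader(io.StringIO(record)))
    return {"id": row[0], "value": row[1] if len(row) > 1 else None}

raw = ['{"id": "a", "value": 1}', "b,2", "c"]   # "c" is an incomplete record
clean = [normalize(r) for r in raw]
```

This only addresses the structural side of heterogeneity; reconciling types and semantics across sources (note that `value` stays an int for JSON but a string for CSV here) is a further step.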
Privacy of information in the context of Big Data
There is a general fear regarding the improper use of personal information, mainly through linking data from numerous sources. Managing privacy is both a technical and a sociological issue [3].
Storage of massive quantities of information
This constitutes the issue of how to identify and store, efficiently, the important data extracted from unstructured information [1,3].
Abstract
A huge amount of information (data on the order of exabytes or zettabytes) is called Big Data. Quantifying such a large amount of data and storing it electronically is not easy. The Hadoop framework is used to handle these large datasets, and the MapReduce programming model is used to process Big Data according to a request. To achieve good performance, Big Data requires proper scheduling. Scheduling techniques are used to reduce starvation, increase resource utilization, and assign jobs to the available resources. Performance can be further improved by enforcing deadline constraints on jobs. The goal of this paper is to study and analyze various scheduling algorithms for better performance.
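The idea of deadline-constrained scheduling mentioned above can be sketched in miniature. The following is an illustrative Earliest-Deadline-First (EDF) sketch on a single resource, not the scheduler of any particular Hadoop implementation; the job names, runtimes, and deadlines are invented for the example:

```python
from dataclasses import dataclass

@dataclass
class Job:
    name: str
    runtime: int   # estimated runtime, in abstract time units
    deadline: int  # absolute deadline, in the same units

def edf_schedule(jobs):
    """Order jobs by Earliest Deadline First on one resource and
    report which jobs finish by their deadlines and which would miss."""
    clock = 0
    scheduled, missed = [], []
    for job in sorted(jobs, key=lambda j: j.deadline):
        if clock + job.runtime <= job.deadline:
            clock += job.runtime        # job runs to completion in time
            scheduled.append(job.name)
        else:
            missed.append(job.name)     # deadline cannot be met; skip it
    return scheduled, missed

# Hypothetical workload
jobs = [Job("etl", 3, 10), Job("report", 2, 4), Job("backup", 5, 20)]
ok, late = edf_schedule(jobs)
```

Real cluster schedulers (e.g. Hadoop's FIFO, Fair, and Capacity schedulers surveyed later) must additionally handle multiple resources, data locality, and preemption, but the same trade-off between deadline satisfaction and resource utilization applies.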