2023
DOI: 10.54254/2755-2721/6/20230798

Understand the working of Sqoop and hive in Hadoop

Abstract: Over the past decades, the analysis of structured and consistent data has seen great success. Analysing multimedia data, which is in an unstructured format, is a challenging task. Here, big data refers to the huge volume of data that can be processed in a distributed fashion. Big data can be analysed using the Hadoop tool, which contains the Hadoop Distributed File System (HDFS) storage space along with several inbuilt components. Hadoop manages the distributed data, which is placed in the form of cluster analy…
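The Sqoop-to-Hive flow the abstract alludes to can be illustrated with a minimal sketch: Sqoop pulls a relational table into HDFS, and Hive then lays a SQL schema over the imported files and queries them. The snippet below is illustrative rather than the paper's own workflow; the MySQL database (salesdb), table (orders), credentials, and HDFS path are hypothetical, and it assumes the sqoop binary is on the PATH and that HiveServer2 is reachable through the third-party pyhive package.

```python
import subprocess
from pyhive import hive  # third-party HiveServer2 client; assumed installed

# Hypothetical connection details -- replace with real cluster values.
MYSQL_URI = "jdbc:mysql://db-host:3306/salesdb"
TABLE = "orders"
HDFS_TARGET = "/user/hive/warehouse/orders_raw"

# Step 1: Sqoop copies the structured RDBMS table into HDFS as delimited files.
subprocess.run(
    [
        "sqoop", "import",
        "--connect", MYSQL_URI,
        "--username", "etl_user",
        "--password-file", "/user/etl_user/.mysql.pwd",
        "--table", TABLE,
        "--target-dir", HDFS_TARGET,
        "--num-mappers", "4",
    ],
    check=True,
)

# Step 2: Hive lays a schema over the imported files and queries them with SQL.
conn = hive.Connection(host="hive-server", port=10000, username="etl_user")
cur = conn.cursor()
cur.execute(
    f"""
    CREATE EXTERNAL TABLE IF NOT EXISTS orders_raw (
        order_id INT, customer_id INT, amount DOUBLE, order_date STRING
    )
    ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
    LOCATION '{HDFS_TARGET}'
    """
)
cur.execute("SELECT customer_id, SUM(amount) FROM orders_raw GROUP BY customer_id")
for row in cur.fetchall():
    print(row)
```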

Cited by 1 publication (1 citation statement)
References 8 publications

“…As for the software program, Linux is selected as the bottom operating system of each server, the version is CentOS 7.5, jdk1.8.0_171 is selected as the JDK version, and Hadoop is selected as v2.6.0, and components such as Spark, Yarn, HDFS, Zookeeper, HBase, Kafka and Redis are also deployed in each node. [5] In addition, the WebCollector-Hadoop version is selected for the WebCollector crawler framework, and it is directly deployed on the Hadoop framework to realize the distributed collection of online public opinion data information. The text processing module completes Jieba word segmentation and TF-IDF feature operation under the Spark framework.…”
Section: System Construction (citation type: mentioning; confidence: 99%)
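The text-processing step the citing paper describes (Jieba word segmentation followed by TF-IDF features under Spark) can be sketched roughly as below. The column names and sample documents are hypothetical, the snippet assumes the jieba package is available on every Spark executor, and the citing system's actual pipeline may differ.

```python
import jieba  # Chinese word segmentation; assumed installed on every executor
from pyspark.sql import SparkSession
from pyspark.sql.functions import udf
from pyspark.sql.types import ArrayType, StringType
from pyspark.ml.feature import HashingTF, IDF

spark = SparkSession.builder.appName("opinion-tfidf-sketch").getOrCreate()

# Hypothetical crawled documents standing in for the collected opinion data.
docs = spark.createDataFrame(
    [(1, "大数据平台采集网络舆情信息"), (2, "Hadoop 分布式文件系统存储海量数据")],
    ["doc_id", "text"],
)

# Jieba segmentation wrapped as a UDF so it runs on the executors.
segment = udf(lambda s: jieba.lcut(s), ArrayType(StringType()))
tokenized = docs.withColumn("words", segment("text"))

# Term frequency via feature hashing, then inverse document frequency.
tf = HashingTF(inputCol="words", outputCol="tf", numFeatures=1 << 18).transform(tokenized)
idf_model = IDF(inputCol="tf", outputCol="tfidf").fit(tf)
features = idf_model.transform(tf)

features.select("doc_id", "tfidf").show(truncate=False)
```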