2018
DOI: 10.1109/tbdata.2017.2721431

Content-Aware Partial Compression for Textual Big Data Analysis in Hadoop

Abstract: This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. A substantial amount of information in companies and on the Internet is present in the form of text. The value of this semi-structured and unstructured data has been widely acknowledged, with consequent scientific and commercial exploitation. The ever-increasing data production, however, pushes data analytic platforms to their limit. Compressio…

citations
Cited by 6 publications
(6 citation statements)
references
References 32 publications
“…This work has presented a comprehensive representation scheme of the involuntary regulation of the behavior of student. Apart from this, there are various other schemes towards big data analytics e.g., tensor-based scheme [27], compression based on context [28], pattern analysis [29], deep learning [30], clustering technique [31]. Existing system has also witnessed an extensive usage of Hadoop framework.…”
Section: Related Work (mentioning)
confidence: 99%
“…The limitation of Fast-Sec approach is to encryption size is large for a given plain test size, which increases storage space. Record aware partial compression (RAPC) has proposed by Dong et al [7], which saves storage space. The plaintext data and security keys are not exposed to main memory and secondary storage.…”
Section: Related Work (mentioning)
confidence: 99%
“…This exceedingly massive data makes the conventional data storage mechanisms inadequate within a tolerable time, and therefore the data storage is one of the major challenges in big data [3]. Note that storing all the data becomes more and more dispensable nowadays, and it is also not conducive to reduce data transmission costs [4, 5]. In fact, data compression storage is widely adopted in many applications, such as IoT [2], industrial data platform [6], bioinformatics [7], wireless networking [8].…”
Section: Introduction (mentioning)
confidence: 99%
“…For example, Reference [14] proposed an adaptive compression scheme in IoT systems, and Reference [15] investigated the backlog-adaptive source coding system in terms of age of information. In fact, most of the previous compression methods usually carried out compression by means of contextual data or leveraging data transformation techniques [4].…”
Section: Introduction (mentioning)
confidence: 99%