Proceedings of the 11th Working Conference on Mining Software Repositories 2014
DOI: 10.1145/2597073.2597091

Mining modern repositories with elasticsearch

Abstract: Organizations are generating, processing, and retaining data at a rate that often exceeds their ability to analyze it effectively; at the same time, the insights derived from these large data sets are often key to the success of the organizations, allowing them to better understand how to solve hard problems and thus gain competitive advantage. Because this data is so fast-moving and voluminous, it is increasingly impractical to analyze using traditional offline, read-only relational databases. Recently, new "b…

Cited by 114 publications (50 citation statements)
References 1 publication
“…To accomplish this necessity, elastic search spreads information to a few physical Lucene lists, these lists are called shards, and every one of the parts of the record is called sharding [1]. Elastic search can do this naturally, and every one of the parts of the record (shards) is unmistakable to the client as one major record.…”
Section: Shards
Citation type: mentioning
confidence: 99%
“…and every one of the parts of the record is called sharding [1]. Elastic search can do this naturally, and every one of the parts of the record (shards) is unmistakable to the client as one major record.…”
Section: Nath Et Al
Citation type: mentioning
confidence: 99%
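
The sharding idea described in the citation statements above (documents spread across several Lucene indices that the client sees as one index) can be illustrated with a small sketch. This is not Elasticsearch code: the `ShardedIndex` class, the `shard_for` helper, the shard count of 3, and the use of `zlib.crc32` as the hash are all assumptions made for illustration, and Elasticsearch's own routing uses a different hash function. The sketch only shows the general pattern of routing each document to one shard by a stable hash and answering a search by querying every shard and merging the results.

```python
import zlib

NUM_SHARDS = 3  # hypothetical shard count, chosen only for this sketch


def shard_for(doc_id: str, num_shards: int = NUM_SHARDS) -> int:
    """Route a document id to a shard number using a stable hash.

    crc32 is a stand-in; it is not the hash Elasticsearch uses.
    """
    return zlib.crc32(doc_id.encode("utf-8")) % num_shards


class ShardedIndex:
    """A toy index split across several in-memory shards."""

    def __init__(self, num_shards: int = NUM_SHARDS) -> None:
        self.num_shards = num_shards
        # Each shard is an independent mapping of doc id -> text.
        self.shards = [dict() for _ in range(num_shards)]

    def index(self, doc_id: str, text: str) -> None:
        # The caller never chooses a shard; routing is automatic.
        shard = shard_for(doc_id, self.num_shards)
        self.shards[shard][doc_id] = text

    def search(self, term: str) -> list:
        # Scatter the query to every shard, then gather and merge hits,
        # so the caller sees a single logical index.
        needle = term.lower()
        hits = []
        for shard in self.shards:
            hits.extend(
                doc_id for doc_id, text in shard.items()
                if needle in text.lower()
            )
        return sorted(hits)


idx = ShardedIndex()
idx.index("commit-1", "Fix bug in parser")
idx.index("commit-2", "Add parser tests")
idx.index("commit-3", "Update documentation")
parser_hits = idx.search("parser")
```

Because the shard is derived from the document id alone, the same id always lands on the same shard, which is why the number of shards in such a scheme cannot be changed without re-routing existing documents.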
“…Therefore, applications and services must be able to scale up to items, domains and data subset of interest. Approaches based on parallel and distributed architectures are spreading also in search engines and indexing systems; for instance, the ElasticSearch engine, which is an open source distributed search engine, designed to be scalable, near real-time capable and providing full-text search capabilities [5].…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…The present work describes a novel framework which allows the execution of general NLP tasks, through the use of the open source GATE tool [11], on a multi-node cluster based on the open source Apache Hadoop Distributed File system (HDFS). The paper is organized as follows: Section II illustrates related work, in terms of state of the art and open issues for both commercial and research literatures; in Section III, an architectural overview of the proposed system is presented; in Section IV, a validation of the system, performed against a real corpus of web resources, is reported; finally, Section V is left for conclusions and future perspectives.…”
Section: Introduction
Citation type: mentioning
confidence: 99%