2017
DOI: 10.15803/ijnc.7.2_271

Automated Dataset Construction from Web Resources with Tool Kayur

Abstract: Many text mining tools cannot be applied directly to documents available on web pages. There are tools for fetching and preprocessing of textual data, but combining them with the data processing tool into one working tool chain can be time consuming. The preprocessing task is even more labor-intensive if documents are located on multiple remote sources with different storage formats. In this paper, we propose the simplification of data preparation process for cases when data come from wide range of web resource…

Cited by 1 publication (1 citation statement)
References 16 publications
“…They might be generated in later stages of software development such as when a developer starts working on the issue and need clarification. Change information can be extracted using the application programming interface of each issue tracking system or text mining tools such as Kurya by Kohan et al, which is tailored specifically for mining issue tracking systems. Their tool can extract up to 3.5 documents per second at a network speed of 50 Mbps for responsive resources.…”
Section: Proposed Technique
Citation type: mentioning
confidence: 99%