2018 2nd East Indonesia Conference on Computer and Information Technology (EIConCIT) 2018
DOI: 10.1109/eiconcit.2018.8878550

An Approach of Web Scraping on News Website based on Regular Expression

Cited by 12 publications
(4 citation statements)
references
References 12 publications
“…Achmad M. et al. [4] describe a method for automatically retrieving the title, publication date, author, clean article text, and URL of a news article from the HTML pages of three news websites, namely Detik, Tribunnews, and Liputan 6, without manually copying and pasting the information. This method consists of three steps: analyzing the structure of news websites, creating Regex patterns, and implementing the patterns as a set of rules for web scraping.…”
Section: Related Works (mentioning)
confidence: 99%
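
The three steps named in the statement above (analyze the page structure, write regex patterns, apply them as scraping rules) can be sketched roughly as follows. This is a minimal illustration, not the cited paper's implementation: the site key `example-news`, the tag and class names in the patterns, the sample HTML, and the helper names `clean` and `scrape` are all hypothetical assumptions. Real news pages differ in markup, and regex-based extraction breaks easily when a site changes its markup.

```python
import re
from html import unescape

# Steps 1 and 2: after inspecting a site's HTML by hand, record one pattern
# per field. These patterns are illustrative assumptions only; they are not
# the patterns from the cited paper or the real markup of any news site.
RULES = {
    "example-news": {
        "title": re.compile(r"<h1[^>]*>(.*?)</h1>", re.S | re.I),
        "date": re.compile(r'<time[^>]*datetime="([^"]+)"', re.I),
        "author": re.compile(r'<span class="author">(.*?)</span>', re.S | re.I),
        "body": re.compile(r'<div class="article-body">(.*?)</div>', re.S | re.I),
    }
}

TAG = re.compile(r"<[^>]+>")
WHITESPACE = re.compile(r"\s+")


def clean(fragment):
    """Strip tags, decode HTML entities, and collapse whitespace."""
    text = TAG.sub(" ", fragment)
    return WHITESPACE.sub(" ", unescape(text)).strip()


def scrape(html, url, site):
    """Step 3: apply the site's rule set to one already-downloaded page."""
    record = {"url": url}
    for field, pattern in RULES[site].items():
        match = pattern.search(html)
        record[field] = clean(match.group(1)) if match else None
    return record


# Hypothetical page content standing in for a downloaded article.
SAMPLE_HTML = """
<html><body>
<h1 class="headline">Sample &amp; Headline</h1>
<span class="author">Jane Doe</span>
<time datetime="2018-11-06">6 Nov 2018</time>
<div class="article-body"><p>First paragraph.</p>
<p>Second   paragraph.</p></div>
</body></html>
"""

article = scrape(SAMPLE_HTML, "https://example.com/news/1", "example-news")
```

Keeping the patterns in a per-site dictionary means supporting another site only requires a new rule set, while `scrape` stays unchanged. A field whose pattern does not match is returned as `None` instead of raising an error.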
“…The websites include Liputan6.com, Detik.com and Tribunnews.com. The reason for choosing the three news websites is because the three news websites have a high level of access in Indonesia [2].…”
Section: ) News Website Selection (mentioning)
confidence: 99%
“…The amount of new news that appears every day becomes a new problem when news websites do not provide API services to download these news. The copy and paste method cannot be used to get news from news websites every day because it will take a very long time [2]. Web scraping technique can be a solution to the problem because this technique can retrieve data from a website quickly.…”
Section: Introduction (mentioning)
confidence: 99%
“…including Sri Lanka, leading to the election of candidates who fail to fulfill their promises and contribute to societal decline [1]. To address these challenges, researchers and scholars have explored various technologies and methodologies to enhance the candidate selection process.…”
Section: Introduction (mentioning)
confidence: 99%