2019
DOI: 10.1007/978-3-030-20005-3_5

Mirkwood: An Online Parallel Crawler

Cited by 3 publications (6 citation statements)
References 7 publications
“…Although our software is designed for parallel architectures, it can also work (albeit slowly) in a single machine: in this case, the forest would produce a single nest with n spiders in it, and all of them would run in that machine. For further architectural and technical details of our tool, please see [56]. For an example of use of our crawler, you can read our previous study on antibiotics [57].…”
Section: Search Engine
Citation type: mentioning
confidence: 99%
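The forest, nest, and spider relationship described in the statement above can be illustrated with a minimal sketch. The class name `Nest`, the function `build_forest`, the machine names, and the round-robin assignment are all assumptions made for illustration; the quoted text only says that a single machine yields one nest holding all n spiders.

```python
from dataclasses import dataclass, field


@dataclass
class Nest:
    """Group of spiders that run on one machine (hypothetical structure)."""
    machine: str
    spiders: list[int] = field(default_factory=list)


def build_forest(machines: list[str], n_spiders: int) -> list[Nest]:
    """Create one nest per machine and assign spider ids round-robin.

    Round-robin is an assumed policy, not taken from the cited paper.
    """
    if not machines:
        raise ValueError("at least one machine is required")
    nests = [Nest(machine=m) for m in machines]
    for spider_id in range(n_spiders):
        nests[spider_id % len(nests)].spiders.append(spider_id)
    return nests


# Single machine: one nest that holds every spider.
single = build_forest(["localhost"], 4)
# Two machines: spiders are split across two nests.
cluster = build_forest(["node-a", "node-b"], 5)
```

With one machine the forest degenerates to a single nest, which matches the single-machine case in the quoted statement; how the real tool schedules spiders across machines is not stated here.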
“…By using private browsing, not allowing cookies and forcing the use of Google's country-independent version, the search engine site had worked well to a point [56][57][58], but it is no longer effective as of 2022. For this research, we had to introduce new technological features, namely VPN usage, multi-country searches and information fusion.…”
Section: Search Engine
Citation type: mentioning
confidence: 99%
“…. and confirming the sites that in fact contain the terms we were looking for) was made, using the improved technology we developed in (24). Then, using an improved functionality that we developed for our current study, it further studied each validated website one by one, looking for specific words in each of them, whether we had explicitly searched for them (e.g., "buy antibiotics") or not (e.g., antibiotic names and families).…”
Section: Discussion
Citation type: mentioning
confidence: 73%
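The per-site word check mentioned in the statement above might look roughly like the following sketch. The function name `find_terms`, the sample text, and the whole-word, case-insensitive matching rule are assumptions for illustration; the cited work does not specify how matching is done.

```python
import re


def find_terms(page_text: str, terms: list[str]) -> dict[str, int]:
    """Count case-insensitive whole-word occurrences of each term in a page."""
    lowered = page_text.lower()
    counts = {}
    for term in terms:
        # re.escape keeps any punctuation in the term from acting as regex syntax.
        pattern = r"\b" + re.escape(term.lower()) + r"\b"
        counts[term] = len(re.findall(pattern, lowered))
    return counts


# Hypothetical page text; the searched phrase and one antibiotic name appear in it.
sample = "Buy antibiotics online. Amoxicillin in stock; buy antibiotics now."
hits = find_terms(sample, ["buy antibiotics", "amoxicillin", "penicillin"])
```

A site could then be treated as validated when the searched phrase has a nonzero count, while the counts for the other terms record words that were found without being searched for explicitly.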
“…After all the improvements of our core technology (24), we neither have to download the websites to study nor rely on third-party software (Heritrix) since we can now access and study each site "on the fly" as the spider visits them (online, without downloading them), which is faster, more reliable, and gets us fresh and updated information. Once the crawler was developed, tested, and ready to use, we ran it following the aforementioned steps: we made it look for, mark, and validate websites matching the search "buy antibiotics online."…”
Section: Discussion
Citation type: mentioning
confidence: 99%