Wouter Mostard scite author profile

Wouter Mostard

2 Publications

1 Citation Statement Received

48 Citation Statements Given


Affiliations

University of Groningen

Publications


Combining Visual and Contextual Information for Fraudulent Online Store Classification

Mostard, Zijlema, Wiering

2019


Following the rise of e-commerce there has been a dramatic increase in online criminal activities targeting online shoppers. Considering that the number of online stores has risen dramatically, manually checking these stores has become intractable. An automated process is therefore required. We approached this problem by applying machine learning techniques to extract and detect instances of fraudulent online stores. Two sources of information were used to determine the legitimacy of an online store. First, contextual features extracted from the HTML and meta information were used to train various machine learning algorithms. Second, visual information, like the presence of social media logos, was added to make improvements on this baseline model. Results show a positive effect for adding visual information, increasing the F1-score from 0.93 to 0.98 over the baseline model. Finally, this research shows that visual information can improve recall during web crawling.

CCS Concepts: Information systems → Web mining; Computing methodologies → Machine learning.
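
The abstract reports its results as F1-scores (0.93 for the baseline, 0.98 with visual information). As a reminder of what that metric measures, here is a minimal Python sketch that computes F1 from confusion counts. The function name and the counts in the usage line are hypothetical and are not taken from the paper.

```python
def f1_score(true_pos: int, false_pos: int, false_neg: int) -> float:
    """Harmonic mean of precision and recall, computed from confusion counts."""
    # Precision: share of stores flagged as fraudulent that really are fraudulent.
    flagged = true_pos + false_pos
    precision = true_pos / flagged if flagged else 0.0
    # Recall: share of truly fraudulent stores that were flagged.
    actual = true_pos + false_neg
    recall = true_pos / actual if actual else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for illustration only (not results from the paper).
example = f1_score(true_pos=90, false_pos=10, false_neg=10)
```

Because F1 combines precision and recall, a classifier can only reach a high score by keeping both false positives and false negatives low.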


Semantic Preserving Siamese Autoencoder for Binary Quantization of Word Embeddings

Mostard, Schomaker, Wiering

2021


Word embeddings are used as building blocks for a wide range of natural language processing and information retrieval tasks. These embeddings are usually represented as continuous vectors, requiring significant memory capacity and computationally expensive similarity measures. In this study, we introduce a novel method for semantic hashing continuous vector representations into lower-dimensional Hamming space while explicitly preserving semantic information between words. This is achieved by introducing a Siamese autoencoder combined with a novel semantic preserving loss function. We show that our quantization model induces only a 4% loss of semantic information over continuous representations and outperforms the baseline models on several word similarity and sentence classification tasks. Finally, we show through cluster analysis that our method learns binary representations where individual bits hold interpretable semantic information. In conclusion, binary quantization of word embeddings significantly decreases time and space requirements while offering new possibilities through exploiting semantic information of individual bits in downstream information retrieval tasks.
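
To illustrate why binary codes in Hamming space are cheap to store and compare, here is a minimal Python sketch. It is not the Siamese autoencoder from the abstract: it swaps in a naive per-dimension median threshold as the quantizer, and the function names and toy vectors are hypothetical.

```python
from statistics import median


def binarize(vectors: list[list[float]]) -> list[int]:
    """Quantize real-valued vectors into integer bit codes.

    Naive stand-in quantizer (an assumption, not the paper's method):
    a bit is 1 when the value exceeds the median of that dimension.
    """
    dims = len(vectors[0])
    thresholds = [median(v[d] for v in vectors) for d in range(dims)]
    codes = []
    for v in vectors:
        code = 0
        for d in range(dims):
            # Shift left and append the bit for dimension d.
            code = (code << 1) | (1 if v[d] > thresholds[d] else 0)
        codes.append(code)
    return codes


def hamming_distance(a: int, b: int) -> int:
    """Number of differing bits: one XOR followed by a bit count."""
    return bin(a ^ b).count("1")


# Toy 4-dimensional "embeddings" (made up for illustration).
toy_vectors = [
    [0.9, -0.2, 0.4, 0.1],
    [0.8, -0.1, 0.5, -0.3],
    [-0.7, 0.6, -0.4, 0.2],
]
codes = binarize(toy_vectors)
nearest_pair_distance = hamming_distance(codes[0], codes[1])
```

Each code needs one bit per dimension instead of a float, and the comparison reduces to an XOR and a bit count rather than a floating-point dot product, which is the source of the time and space savings the abstract describes.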

