Collecting reference papers from the Internet is one of the most important activities in advancing research and writing papers about its results. Unfortunately, the current process using Google Scholar may not be efficient, since many paper files cannot be accessed directly by the user. Even when they are accessible, their relevance needs to be checked manually. In this paper, we propose a reference paper collection system that uses web scraping to automate paper collection from websites. The system collects or monitors data from the Internet, which is regarded as the environment, using Selenium, a browser automation tool widely used for web scraping, as the sensor. It then examines the similarity of each collected paper to the search target by comparing keywords using the BERT model. BERT is a deep learning model for natural language processing (NLP) that can understand context by analyzing the relationships between words in a sentence bidirectionally. Python Flask is adopted for the web application server, and Angular is used for data presentation. For the evaluation, we measured the performance, investigated the accuracy, and asked members of our laboratory to use the proposed method and provide their feedback. The results confirm the method’s effectiveness.
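The keyword comparison step described above can be illustrated with a minimal sketch. It is not the implementation used in the proposed system: the names `toy_embed`, `cosine_similarity`, and `rank_papers` are hypothetical, the threshold parameter is an assumption, and a simple bag-of-words vector stands in for the BERT embedding so that the example runs without any external model.

```python
import math
from collections import Counter
from typing import Callable, Dict, List, Tuple

Vector = Dict[str, float]


def toy_embed(text: str) -> Vector:
    """Stand-in for a BERT embedding (assumption): bag-of-words counts."""
    return {word: float(count) for word, count in Counter(text.lower().split()).items()}


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_papers(
    query_keywords: str,
    papers: Dict[str, str],
    embed: Callable[[str], Vector] = toy_embed,
    threshold: float = 0.0,
) -> List[Tuple[str, float]]:
    """Score each paper's keywords against the query and sort by similarity.

    `papers` maps a paper title to its keyword string (hypothetical layout).
    Papers scoring below `threshold` are dropped.
    """
    query_vector = embed(query_keywords)
    scored = [
        (title, cosine_similarity(query_vector, embed(keywords)))
        for title, keywords in papers.items()
    ]
    kept = [(title, score) for title, score in scored if score >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    collected = {
        "Paper A": "web scraping selenium automation",
        "Paper B": "protein folding simulation",
    }
    for title, score in rank_papers("web scraping", collected):
        print(f"{title}: {score:.3f}")
```

In a setting closer to the one described here, the `embed` argument would be replaced by a function that returns BERT sentence embeddings, and `papers` would be filled from the pages collected by Selenium. The ranking logic itself would stay the same.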