2017 IEEE International Conference on Industrial Engineering and Engineering Management (IEEM) 2017
DOI: 10.1109/ieem.2017.8290312

Development of an entropy-based feature selection method and analysis of online reviews on real estate

Abstract: In recent years, the amount of data posted about real estate on the Internet has been increasing. In this study, to analyze user needs for real estate, we focus on "Mansion Community", a Japanese bulletin board system (hereinafter referred to as BBS) about Japanese real estate. We extract keywords based on the entropy value of each word and use them as features in a machine learning classifier to analyze 6 million posts on "Mansion Community". As a resul…
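
The abstract does not give the exact entropy definition, so the following is only a minimal sketch of one common reading of entropy-based keyword selection, not the authors' method. It assumes posts are already tokenized and labeled with a category, computes the Shannon entropy of each word's distribution over those categories, and keeps the lowest-entropy words as features. The function names `word_entropies` and `select_keywords` and the `min_df` cutoff are hypothetical.

```python
import math
from collections import Counter, defaultdict


def word_entropies(docs, labels):
    """Shannon entropy (in bits) of each word's distribution over labels.

    docs   -- list of token lists (tokenization is assumed to be done already)
    labels -- one category label per document (an assumption of this sketch)
    """
    # word -> Counter(label -> number of documents containing the word)
    counts = defaultdict(Counter)
    for tokens, label in zip(docs, labels):
        for word in set(tokens):
            counts[word][label] += 1

    entropies = {}
    for word, by_label in counts.items():
        total = sum(by_label.values())
        h = 0.0
        for c in by_label.values():
            p = c / total
            h -= p * math.log2(p)
        entropies[word] = h
    return entropies


def select_keywords(docs, labels, k, min_df=2):
    """Return the k lowest-entropy words that occur in at least min_df documents."""
    entropies = word_entropies(docs, labels)
    doc_freq = Counter(word for tokens in docs for word in set(tokens))
    candidates = [w for w in entropies if doc_freq[w] >= min_df]
    # Low entropy first; ties broken by higher document frequency, then by word.
    candidates.sort(key=lambda w: (entropies[w], -doc_freq[w], w))
    return candidates[:k]
```

Under this reading, a word that appears evenly across all categories has high entropy and is dropped, while a word concentrated in one category has entropy near zero and is kept as a classifier feature.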

Citations: cited by 5 publications (3 citation statements)
References: 12 publications
“…In another study, the linear regression method and the gradient boosting approach are compared for the modelling of the property prices in China considering 253 units where it is demonstrated that the gradient boosting method is more accurate [10]. The SVM methodology is applied for the modelling of the housing prices in Japan for 6320631 advertisements where it is shown that the SVM approach is an effective method for the property price modelling [11]. The three types of regression namely the linear regression, multivariate regression and the polynomial regression are utilized for the modelling of the housing prices in India considering 21000 properties and it is found out that the mixture of these three methods performs the best [12].…”
Section: Literature Survey (mentioning)
confidence: 99%
“…There are two categories of problem-solving using machine learning namely supervised learning and unsupervised learning algorithms (Fiorucci & James, 2020). Between the two, the most commonly used is the supervised learning algorithm specifically for predicting ϒ (Horino & Nonaka, 2017). The supervised learning algorithm is used for predicting the outcome of a given input; it uses the examples of the input or output pairs and requires human effort to create a training set for building the machine learning algorithm process (Jordan & Mitchell, 2015).…”
Section: Machine Learning (mentioning)
confidence: 99%
“…Random Forest is also known as an ensemble learning which can be used in classification and regression problem methods; one of its advantages is protecting against overfitting which in turn improves performance (Sabbeh, 2018). It is usually used as a decision-making tool in real estate specifically for predicting the price of housing (Horino & Nonaka, 2017). Random Forest has a Decision Tree collection known as "Forest", but Random Forest generalizes better than Decision Tree towards improving accuracy i.e., by selecting the highest votes (Fiorucci & James, 2020). Next is the Linear Regression algorithm or better known as ordinary least square OLS (Varma & Sarma, 2018).…”
mentioning
confidence: 99%