Combination of machine learning-based automatic valuation models for residential properties in South Korea

Jengei Hong; Woo-sung Kim

DOI: https://doi.org/10.3846/ijspm.2022.17909

Abstract

Machine learning (ML) techniques are increasingly being applied to automatic real estate valuation models. Their main advantage is that they can better capture the complexity of the value determination process, and they have consequently been shown to outperform conventional models. In this paper, the latest ML algorithms (i.e., support vector machine, random forest, XGBoost, LightGBM, and CatBoost) are examined as automatic valuation models, and several combination methods are proposed to improve the models’ predictive power. We applied the ML models to approximately 57,000 records of apartment transactions that occurred in Seoul in 2018, provided by South Korea’s Ministry of Land, Infrastructure, and Transport. The results are as follows. First, ML-based predictors (especially the latest decision tree-based algorithms) perform better than conventional models. Second, the prediction error of one model can be partially offset by another model’s error, which implies that an efficient averaging of the predictors improves predictive accuracy. Third, the models’ relative performance may itself be learned by ML algorithms, which means that they can also be used to recommend which algorithm should be selected for making predictions.
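The second finding, that one model's errors can partly cancel another's, can be illustrated with a minimal sketch. It is not the paper's combination method: the prices, the two hypothetical models (`model_a`, `model_b`), and the inverse-MSE weighting scheme are all assumptions chosen for illustration, and the weights are fitted on the same toy data they are evaluated on.

```python
from math import sqrt

def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def combine(predictions, weights):
    """Weighted average of several models' predictions (one weight per model)."""
    total = sum(weights)
    return [
        sum(w * p for w, p in zip(weights, row)) / total
        for row in zip(*predictions)
    ]

def inverse_mse_weights(actual, predictions):
    """Assumed weighting scheme: each model is weighted by 1 / its MSE."""
    return [1.0 / rmse(actual, p) ** 2 for p in predictions]

# Made-up validation prices and predictions from two hypothetical models.
actual = [500.0, 620.0, 710.0, 830.0]
model_a = [520.0, 600.0, 735.0, 815.0]  # errors: +20, -20, +25, -15
model_b = [490.0, 650.0, 700.0, 850.0]  # errors: -10, +30, -10, +20

# Equal-weight average and inverse-MSE weighted average of the two models.
simple = combine([model_a, model_b], [1.0, 1.0])
weights = inverse_mse_weights(actual, [model_a, model_b])
weighted = combine([model_a, model_b], weights)

for name, pred in [("A", model_a), ("B", model_b),
                   ("simple average", simple), ("inverse-MSE average", weighted)]:
    print(f"{name}: RMSE = {rmse(actual, pred):.2f}")
```

In this toy data the two models tend to err in opposite directions on the same properties, so both averages have a lower RMSE than either model alone. When the models' errors are strongly positively correlated, the gain from averaging shrinks.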

Keywords: automatic valuation model, mass appraisal, machine learning (ML) techniques, combined approach, decision tree-based algorithms

How to Cite

Hong, J., & Kim, W.-S. (2022). Combination of machine learning-based automatic valuation models for residential properties in South Korea. International Journal of Strategic Property Management, 26(5), 362–384. https://doi.org/10.3846/ijspm.2022.17909

Published in Issue

Nov 17, 2022

This work is licensed under a Creative Commons Attribution 4.0 International License.
