Estimating UK House Prices using Machine Learning
Prof Doc Thesis
Awonaike, A. 2022. Estimating UK House Prices using Machine Learning. Prof Doc Thesis University of East London School of Architecture, Computing and Engineering https://doi.org/10.15123/uel.8v108
Authors | Awonaike, A. |
---|---|
Type | Prof Doc Thesis |
Abstract | House price estimation is an important subject for property owners, property developers, investors and buyers. It has featured in many academic research papers and some government and commercial reports. The price of a house may vary depending on several features including geographic location, tenure, age, type, size, market, etc. Existing studies have largely focused on applying single or multiple machine learning techniques to single or groups of datasets to identify the best performing algorithms, models and/or most important predictors, but this paper proposes a cumulative layering approach to what it describes as a Multi-feature House Price Estimation (MfHPE) framework. The MfHPE is a process-oriented, data-driven and machine learning based framework that does not just identify the best performing algorithms or features that drive the accuracy of models but also exploits a cumulative multi-feature layering approach to creating machine learning models, optimising and evaluating them so as to produce tangible insights that enable the decision-making process for stakeholders within the housing ecosystem for a more realistic estimation of house prices. Fundamentally, the MfHPE framework development leverages the Design Science Research Methodology (DSRM) and HM Land Registry’s Price Paid Data is ingested as the base transactions data. 1.1 million London-based transaction records between January 2011 and December 2020 have been exploited for model design, optimisation and evaluation, while 84,051 2021 transactions have been used for model validation. With the capacity for updates to existing datasets and the introduction of new datasets and algorithms, the proposed framework has also leveraged a range of neighbourhood and macroeconomic features including the location of rail stations, supermarkets, bus stops, inflation rate, GDP, employment rate, Consumer Price Index (CPIH) and unemployment rate to explore their impact on the estimation of house prices and their influence on the behaviours of machine learning algorithms. Five machine learning algorithms have been exploited and three evaluation metrics have been used. Results show that the layered introduction of new variety of features in multiple tiers led to improved performance in 50% of models, a change in the best performing models as new variety of features are introduced, and that the choice of evaluation metrics should not just be based on technical problem types but on three components: (i) critical business objectives or project goals; (ii) variety of features; and (iii) machine learning algorithms. |
Year | 2022 |
Publisher | University of East London |
Digital Object Identifier (DOI) | https://doi.org/10.15123/uel.8v108 |
File | License File Access Level Anyone |
Publication dates | |
Online | 12 Oct 2022 |
Publication process dates | |
Submitted | 07 Sep 2022 |
Deposited | 12 Oct 2022 |
https://repository.uel.ac.uk/item/8v108
Download files
948
total views2724
total downloads43
views this month54
downloads this month