![APPLICATION OF RANDOM FOREST ALGORITHM TO STUDY FACTORS AFFECTING RESIDENTIAL LAND PRICES IN ĐÀ LẠT CITY](https://ezcloud.vn/wp-content/uploads/2024/08/thanh-pho-da-lat.webp)
APPLICATION OF RANDOM FOREST ALGORITHM TO STUDY FACTORS AFFECTING RESIDENTIAL LAND PRICES IN ĐÀ LẠT CITY
Nguyễn Ngọc Thanh
Nguyễn Thị Thùy An
Trần Thị Ánh Tuyết
College of Agriculture and Forestry, Huế University
Đỗ Minh Tùng
Department of Natural Resources and Environment of Lâm Đồng Province
Abstract:
This study applied the Random Forest algorithm to analyze the factors affecting residential land prices in Đà Lạt city: state-regulated land prices, transaction time, bank interest rates, distance to the administrative center, width of the adjacent road, plot shape, plot area, plot orientation, infrastructure, amenities, location coefficient, and planning factors. A dataset of 100 plots was split at a 70:30 ratio to train and validate the model.
The results showed that the forecast model performed best with 500 decision trees, reaching an R² of 0.98. Distance to the administrative center had the greatest influence on transacted residential land prices (35.1%), followed by the width of the adjacent road (28.9%). State-regulated prices explained 10.2% of residential land prices.
Other factors had an insignificant impact on residential land prices. This study has contributed an effective forecasting model in the field of land management and real estate.
Keywords: Residential land prices, influencing factors, random forest, Đà Lạt.
I. INTRODUCTION
In the context of globalization and expanding urbanization, the real estate market has become one of the most important sectors for socio-economic development. Specifically in Vietnam, land prices are not only an economic indicator but also a cultural and social indicator. This has created an urgent need to extensively analyze the factors affecting land prices to support investment decisions and sustainable urban development [2].
In recent years, Đà Lạt city has not only been a favorite destination for domestic and international tourists but has also become a hotspot in the real estate market. The increasing interest from investors has driven land prices in many areas of the city to soar [3]. In Đà Lạt city, the fluctuations in land prices due to high demand, combined with the development of tourism and services, have created new challenges and opportunities for land management.
However, uncontrolled land price increases can lead to negative consequences such as land speculation, reducing access to housing for local residents, and affecting the sustainable development of the city. For this reason, analyzing and understanding the factors affecting land prices becomes extremely important. These factors may include geographical location, surrounding amenities, urban planning, infrastructure, and tourism development potential. From a land management perspective, understanding the factors affecting land prices is a solid basis for policy planning, effective land use management, and infrastructure development.
Although previous studies on land prices and influencing factors exist, most of them focus on a macro scale and rarely explore the local level. This creates a gap in knowledge about the land market in areas with unique cultural and tourism values, where influencing factors can differ significantly from other areas [1]. Additionally, current land valuation methods such as the comparison method, income method, surplus method, and adjustment coefficient method cannot reflect the impact of individual factors on land prices [4].
In recent years, the rise of machine learning methods has brought many opportunities for data analysis. In particular, the Random Forest algorithm is known for requiring little data preprocessing and only modest computing resources. It is a classification and regression method that builds a collection of decision trees at training time. The goal of this study is therefore to apply the Random Forest algorithm to analyze the influence of different factors on residential land prices, thereby providing an overview of residential land prices in Đà Lạt city.
II. DATA AND RESEARCH METHODS
- Research Data
In this study, we collected information on 100 residential land plots transacted in Đà Lạt city from 2020 to 2023. Twelve factors affecting residential land prices were selected: state-regulated land prices, transaction time, bank interest rates, distance to the administrative center, width of the adjacent road, plot shape, plot area, plot orientation, infrastructure, amenities, location coefficient, and planning factors. These factors are the input variables of the model. The transacted residential land price is the observed variable that the model forecasts. The initial dataset of 100 survey points combines the information on these influencing factors with the transacted residential land prices.
- Research Methods
Random Forest is a machine learning algorithm used in classification and regression problems. In this study, the randomForest package in R was used to implement the residential land price analysis model. The research data was divided into training data and validation data at a 70:30 ratio, so that the model learns from the training data and its predictive ability is then tested on data it has not seen. This gives an indication of how well the model generalizes to real-world cases. Random Forest builds many decision trees, each trained on a random subset of the training data, and combines their predictions.
This helps reduce the risk of overfitting and improves the model’s accuracy. The number of decision trees (ntree) and the number of variables considered at each split (mtry) were set to 500 and 12 (the 12 factors affecting transacted residential land prices), respectively. Once the model is built, its performance is evaluated by predicting on the validation dataset and comparing the predictions with the actual prices. In this study, the coefficient of determination (R²) is used to evaluate the model’s fit: the closer it is to 1, the better the predictions match the observed prices; the closer it is to 0, the weaker the fit.
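$$R^2 = \frac{\left[\sum_{i=1}^{n}\left(y_i - \bar{y}\right)\left(\hat{y}_i - \bar{\hat{y}}\right)\right]^2}{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2 \, \sum_{i=1}^{n}\left(\hat{y}_i - \bar{\hat{y}}\right)^2}$$

Where:
$y_i$ is the actual transacted land price;
$\hat{y}_i$ is the land price predicted by the model;
$\bar{y}$ is the average value of $y_i$;
$\bar{\hat{y}}$ is the average value of $\hat{y}_i$;
n is the number of observations in the validation dataset.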
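As a minimal sketch of the fit measure, the Python function below computes R² as the squared correlation between actual and predicted prices. This form is an assumption inferred from the definitions above, which refer to the averages of both the actual and the predicted values. The study itself used R, and the five price pairs are hypothetical values chosen only for illustration.

```python
from math import fsum


def r_squared(actual, predicted):
    """Squared Pearson correlation between actual and predicted values."""
    n = len(actual)
    if n != len(predicted) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_a = fsum(actual) / n
    mean_p = fsum(predicted) / n
    # Numerator: co-variation of actual and predicted around their means.
    cov = fsum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    # Denominator: product of the two sums of squared deviations.
    var_a = fsum((a - mean_a) ** 2 for a in actual)
    var_p = fsum((p - mean_p) ** 2 for p in predicted)
    if var_a == 0 or var_p == 0:
        raise ValueError("R^2 is undefined when either series is constant")
    return cov ** 2 / (var_a * var_p)


# Hypothetical prices in million VND/m^2 (not taken from the study's dataset).
actual = [7.0, 12.5, 18.0, 25.0, 31.0]
predicted = [8.0, 12.0, 17.0, 26.5, 29.5]
print(round(r_squared(actual, predicted), 3))
```

Because this form measures correlation, it rewards predictions that move together with the actual prices even if they are systematically shifted; an alternative definition based on residual sums of squares would penalize such a shift.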
III. RESEARCH RESULTS AND DISCUSSION
- Application of Random Forest Algorithm to Analyze Residential Land Prices in Đà Lạt City
In this study, the Random Forest algorithm was used to analyze residential land prices in Đà Lạt city. The input factors for the model are state-regulated land prices, transaction time, bank interest rates, distance to the administrative center, width of the adjacent road, plot shape, plot area, plot orientation, infrastructure, amenities, location coefficient, and planning factors. The factor to be forecast is the successfully transacted residential land price. The initial dataset was divided into a training set with 70 observations and a validation set with 30 observations.
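The 70:30 division described above can be sketched as follows. This is an illustrative Python version, not the R code used in the study; the function name `split_70_30`, the seed, and the plot identifiers 1 to 100 are assumptions introduced for the example.

```python
import random


def split_70_30(records, train_fraction=0.7, seed=42):
    """Shuffle a copy of the records and cut it into training and validation parts."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical identifiers standing in for the 100 surveyed plots.
plot_ids = list(range(1, 101))
train_ids, valid_ids = split_70_30(plot_ids)
print(len(train_ids), len(valid_ids))  # 70 30
```

Shuffling before cutting keeps the split from following the order in which plots were recorded, for example by transaction year.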
Figure 1 shows that when the number of decision trees is still low, the model error is quite high and fluctuates significantly. This means that the model is not stable when the number of decision trees is low. As the number of decision trees increases, the model error gradually decreases and stabilizes. This indicates that increasing the number of decision trees helps improve the model’s accuracy. The model achieved the best performance at 500 decision trees.
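The kind of curve described for Figure 1 can be reproduced in miniature by tracking validation error as trees are added to an ensemble. The sketch below is a simplified stand-in written in Python, not the randomForest model from the study: each "tree" is a single-split regression stump fitted to a bootstrap sample, and the data are synthetic (a hypothetical price that falls with distance, plus noise). Averaging bootstrap-trained trees without restricting the candidate variables is bagging, which is what a random forest reduces to when mtry equals the number of predictors.

```python
import random
from statistics import fmean


def fit_stump(xs, ys):
    """Fit a one-split regression tree; returns (threshold, left_mean, right_mean)."""
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot split between identical x values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        lm, rm = fmean(left), fmean(right)
        sse = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, (pairs[i - 1][0] + pairs[i][0]) / 2, lm, rm)
    if best is None:  # all x identical: always predict the overall mean
        m = fmean(ys)
        return (float("inf"), m, m)
    return best[1:]


def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def mse(preds, ys):
    return fmean([(p - y) ** 2 for p, y in zip(preds, ys)])


rng = random.Random(0)


def make_point():
    # Synthetic rule (assumption): price falls with distance, with Gaussian noise.
    x = rng.uniform(0, 2000)
    return x, 30 - 0.012 * x + rng.gauss(0, 3)


train = [make_point() for _ in range(70)]
valid = [make_point() for _ in range(30)]
vx, vy = zip(*valid)

stumps, error_by_size = [], []
for _ in range(200):
    sample = [rng.choice(train) for _ in train]  # bootstrap sample of the training set
    sx, sy = zip(*sample)
    stumps.append(fit_stump(sx, sy))
    # Ensemble prediction = average of all stumps fitted so far.
    preds = [fmean([predict_stump(s, x) for s in stumps]) for x in vx]
    error_by_size.append(mse(preds, vy))

for k in (1, 10, 50, 200):
    print(k, round(error_by_size[k - 1], 2))
```

Plotting `error_by_size` against the ensemble size gives an error curve of the same type as Figure 1; the exact shape depends on the data and the random seed.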
Figure 2 explains the influence of input factors on the transacted residential land prices in Đà Lạt city during the study period. The factor of distance to the administrative center has the greatest influence on transacted residential land prices, accounting for 35.1% in land price prediction. This suggests that proximity to the administrative center is the most important factor in determining the value of real estate in Đà Lạt city, as the administrative center usually concentrates essential public amenities and services.
The width of the adjacent road ranks second with 28.9%, highlighting the importance of convenient transportation access or business development potential from the land plot. State-regulated prices are also a significant factor in explaining the model, accounting for 10.2%. The plot area accounts for 5.02% in the analysis model. Bank interest rates at the transaction time and transaction time respectively explain 4.82% and 4.29% of the model, reflecting the influence of financial and market conditions at the time of purchase.
The location coefficient accounts for 3.4% in determining transaction prices. Infrastructure (2.56%), plot shape (2.14%), planning factors (2.09%), and amenities (1.48%) have a lower impact on transacted residential land prices. Overall, the Random Forest model highlights the importance of considering multiple factors when valuing residential land, and it shows the extent to which each factor influences transacted residential land prices in Đà Lạt city. The R² value of the model is 0.98, indicating that the predicted prices track the actual prices closely.
Figure 1. Accumulated error chart of the Random Forest model
Figure 2. Explanation of the influence levels of factors affecting transacted residential land prices
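The influence levels quoted in the text for Figure 2 can be tallied with a short script. The shares below are the percentages stated in this section; plot orientation is left out because the text reports no share for it. The variable names are illustrative.

```python
# Influence levels (%) as reported in the text for Figure 2.
reported_importance = {
    "distance to administrative center": 35.1,
    "width of adjacent road": 28.9,
    "state-regulated price": 10.2,
    "plot area": 5.02,
    "bank interest rate": 4.82,
    "transaction time": 4.29,
    "location coefficient": 3.4,
    "infrastructure": 2.56,
    "plot shape": 2.14,
    "planning factors": 2.09,
    "amenities": 1.48,
}

# Rank factors from most to least influential and print a running total.
ranked = sorted(reported_importance.items(), key=lambda kv: kv[1], reverse=True)
running = 0.0
for name, share in ranked:
    running += share
    print(f"{name:<36}{share:>6.2f}{running:>8.2f}")

top_three = round(sum(share for _, share in ranked[:3]), 2)
total = round(sum(reported_importance.values()), 2)
print("top three:", top_three)  # 74.2
print("all reported:", total)   # 100.0
```

The three leading factors together account for 74.2%, and the eleven reported shares add up to 100%.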
Figure 3 shows that the residential land price forecast model performs well with high accuracy, as evidenced by the majority of forecasted values being close to the actual values. Although some errors still exist, they are insignificant compared to the overall results. Maintaining a high correlation coefficient and consistency in the forecast demonstrates that the model is reliable for predicting residential land prices in Đà Lạt city.
Figure 3. Comparison chart between actual transaction values and forecasted values of the model
- Analysis of the Relationship of Factors Affecting Residential Land Prices in Đà Lạt City
2.1. State-regulated Residential Land Prices
Figure 4. The Discrepancy Between Transacted Residential Land Prices and State-regulated Land Prices
State-regulated land prices are the land prices announced by the People’s Committee of Lâm Đồng Province. Figure 4 indicates that market sale prices are generally higher than state-regulated land prices; that is, the market tends to value land above the state valuation. This discrepancy can be attributed to various factors such as location, surrounding amenities, and market demand.
Figure 5a reveals the relationship between state-regulated prices and transacted prices, with a correlation coefficient of only 0.16. The data points in Figure 5a are scattered, indicating no strong relationship between state and market prices. This suggests that transacted market prices are not closely dependent on state prices. Other factors such as supply and demand, infrastructure, and specific market conditions of each area may have a greater impact on the actual market sale prices.
Figure 5. Relationship between influencing factors and transacted residential land prices
2.2. Transaction Time
Land prices tend to fluctuate over time, depending on the economic situation and the real estate market at the time of the transaction. Figure 5b shows the fluctuation in transacted residential land prices during the study period. In 2020, the average transacted land price was around 7 million VND/m², with a relatively narrow price range. In 2021, land prices fell below the 2020 level (under 7 million VND/m²) in a less volatile market. In 2022, land prices rose sharply, with an average transaction price above 15 million VND/m². The price range for residential land also widened significantly, showing great market volatility. By 2023, the average land price had increased slightly compared to 2022.
2.3. Bank Interest Rates
Interest rates directly affect the borrowing capacity of buyers, thereby impacting land prices. Figure 5c shows the relationship between transacted residential land prices and bank interest rates. At an interest rate of 5.5%, transacted land prices tended to be low, mainly below 10 million VND/m². Conversely, at interest rates between 7.0% and 7.5%, transacted land prices tended to be higher, with many transactions exceeding 30 million VND/m². Transacted land prices therefore differ markedly across interest rate levels, reflecting significant fluctuations in the land market.
2.4. Distance to the Administrative Center
Theoretically, the closer the land is to the administrative center, the higher its value due to the convenience of transportation and amenities. Figure 5d shows a negative correlation between transacted land prices and the distance to the administrative center.
As the distance to the administrative center increases, land prices tend to decrease. Specifically, at short distances of less than 500m, land prices are higher and fluctuate significantly, ranging from 10 million VND/m² to over 30 million VND/m², indicating that land close to the administrative center has high value. At distances from 500m to 1000m, land prices gradually decrease, mainly ranging from 5 million VND/m² to 20 million VND/m², reflecting the importance of proximity to the administrative center. When the distance exceeds 1000m, land prices decrease significantly, with many transactions below 5 million VND/m². At further distances (over 1500m), land prices are very low and show little fluctuation.
2.5. Width of the Adjacent Road
Figure 5e describes the relationship between the width of the adjacent road and transacted land prices. The chart shows that the wider the road, the higher the land prices tend to be. Specifically, land adjacent to roads 3m and 4m wide has very low prices with little fluctuation. As the road width increases to 6m, land prices become higher, still with little fluctuation. On wider roads, such as 9m and 10m, land prices are higher still.
Particularly, on roads 11 and 12m wide, land prices reach the highest levels and show significant fluctuation, indicating that land in these areas has high value and a diverse range of sale prices. Thus, it can be seen that land plots located on wide roads generally have higher land prices due to the convenience of transportation and business opportunities.
2.6. Plot Shape
The shape of the land also affects the value of the plot, with regularly shaped plots that are easy to use often being more desirable. Figure 5f illustrates the relationship between plot shape and transacted land prices. According to the chart, rectangular plots have the highest value and the greatest price fluctuation.
The median price of rectangular plots is quite high compared to other shapes, indicating that the typical value of these plots is higher than that of others. Meanwhile, polygon-shaped plots have lower prices and less fluctuation. Square plots have more stable prices, with a typical price higher than that of polygon-shaped plots but lower than that of rectangular ones. Overall, plot shape influences the value and volatility of land prices, with rectangular plots generally having the highest economic value.
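Comparisons of this kind, a median and a spread per category, can be computed directly from the transaction records. The following Python sketch groups prices by category and reports the median and interquartile range for each group. The ten records are hypothetical values invented for illustration and do not come from the study's dataset; the same function would apply to plot orientation, infrastructure, amenities, or planning type.

```python
from collections import defaultdict
from statistics import median, quantiles


def summarize_by_group(records):
    """Return median, interquartile range, and count of prices for each category."""
    groups = defaultdict(list)
    for category, price in records:
        groups[category].append(price)
    summary = {}
    for category, prices in groups.items():
        q1, _, q3 = quantiles(prices, n=4)  # needs at least two prices per group
        summary[category] = {"median": median(prices), "iqr": q3 - q1, "n": len(prices)}
    return summary


# Hypothetical (shape, price in million VND/m^2) records, for illustration only.
records = [
    ("rectangular", 12.0), ("rectangular", 18.0), ("rectangular", 25.0), ("rectangular", 31.0),
    ("square", 11.0), ("square", 13.0), ("square", 15.0),
    ("polygon", 6.0), ("polygon", 8.0), ("polygon", 9.0),
]
summary = summarize_by_group(records)
for shape, stats in summary.items():
    print(shape, stats)
```

The median describes the typical price of a group, while the interquartile range is one way to quantify what the text calls fluctuation.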
2.7. Plot Area
Plot area also influences the transaction value of a land plot. Figure 5g illustrates the relationship between plot area and transacted land prices, showing an inverse relationship: as the plot area increases, the price per square meter tends to decrease, as the downward-sloping trend line on the chart makes clear. For smaller plots (below 100 m²), land prices vary significantly, ranging from approximately 5 million VND/m² to over 30 million VND/m². As the plot area increases from 100 m² to 400 m², land prices gradually decrease. For plots larger than 400 m², land prices continue to decrease.
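A trend line like the one described for Figure 5g is commonly obtained by ordinary least squares; whether the chart used exactly this method is an assumption. The Python sketch below fits such a line to hypothetical area and price pairs (invented for illustration) and shows that a negative slope corresponds to an inverse relationship.

```python
from statistics import fmean


def fit_trend_line(xs, ys):
    """Ordinary least-squares line y = slope * x + intercept."""
    mx, my = fmean(xs), fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


# Hypothetical plot areas (m^2) and prices (million VND/m^2), for illustration only.
areas = [60, 90, 150, 220, 300, 420, 500]
prices = [28.0, 22.0, 17.5, 14.0, 11.0, 8.5, 7.0]
slope, intercept = fit_trend_line(areas, prices)
print(f"slope = {slope:.4f} million VND/m^2 per m^2, intercept = {intercept:.2f}")
```

The same function applied to the location coefficient would yield a positive slope if prices rise with the coefficient.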
2.8. Plot Orientation
According to feng shui and cultural habits, the orientation of the land can affect its value. Figure 5h illustrates the relationship between plot orientation and transacted land prices, showing a significant impact of land orientation on transaction value. According to the chart, land with a southern orientation has the highest median price. However, land with a northern orientation exhibits the greatest price fluctuation.
Meanwhile, southwestern, eastern, and southeastern orientations have lower median prices than the southern orientation, possibly due to feng shui factors or less favorable geographical positions in Đà Lạt. The western orientation also has a low median price but shows significant fluctuation. Northeastern and northwestern orientations have relatively high median prices and less fluctuation compared to other orientations, indicating greater price stability. Overall, plot orientation influences transaction value, with south-facing plots having the highest median price and north-facing plots showing the greatest price fluctuation.
2.9. Infrastructure
Infrastructure such as electricity, water, and good roads enhances the value of the plot. Figure 5i presents the relationship between infrastructure and transacted land prices, comparing two types of infrastructure: areas with concrete roads, full electricity, and water; and areas with gravel roads, full electricity, and water. Land prices in areas with concrete roads, full electricity, and water have a significantly higher median than areas with gravel roads.
Specifically, land prices in areas with good infrastructure range from approximately 4 million VND/m² to 35 million VND/m², with most transactions concentrated between 10 million VND/m² and 20 million VND/m². In contrast, land prices in areas with gravel roads, full electricity, and water are much lower and show less price fluctuation. Overall, areas with concrete road infrastructure have better land prices than those with gravel roads.
2.10. Amenities
Proximity to schools and markets also increases the value of the land plot. Figure 5j presents the relationship between amenities and transacted land prices. This study compares two areas: near schools and markets, and far from schools and markets. The chart shows that land prices in areas near schools and markets have a significantly higher median than areas far from these amenities.
Specifically, land prices near schools and markets range from approximately 8 million VND/m² to 35 million VND/m², with most transactions concentrated between 15 million VND/m² and 23 million VND/m². In contrast, land prices in areas far from schools and markets are lower and less volatile, ranging from approximately 2 million VND/m² to 16 million VND/m². It can be seen that land plots near schools and markets have higher transaction prices than those far from schools and markets.
2.11. Location Coefficient
The location coefficient is a factor that adjusts the value of the land plot based on its specific location within an area. Figure 5k illustrates the relationship between the location coefficient and transacted land prices, showing a positive relationship. As the location coefficient increases, land prices also tend to increase, as indicated by the upward-sloping trend line in the chart. At a low location coefficient (below 0.4), land prices exhibit significant fluctuation and are mostly at low levels. As the location coefficient increases from 0.4 to 0.8, land prices start to rise, with the data points distributed fairly evenly.
Particularly, at a high location coefficient (close to 1), land prices reach the highest levels. This indicates that locations with high coefficients are often in prime, convenient areas with higher economic potential. Significant fluctuations at low-coefficient locations may be due to differences in geographical conditions and infrastructure. Overall, the location coefficient is an important factor affecting land value, with higher coefficients leading to higher and more concentrated land prices.
2.12. Planning Factors
Planning factors also significantly affect the value of a land plot. Figure 5l presents the relationship between planning factors (urban residential land and rural residential land) and transacted prices, showing a clear difference in land values between these two types of planning. Urban residential land has a higher median price compared to rural residential land and exhibits greater price fluctuation, with a price range from approximately 2 million VND/m² to 35 million VND/m².
Most urban residential land transactions are concentrated between 12 million VND/m² and 23 million VND/m². In contrast, plots planned as rural residential land have lower prices and less fluctuation, with a price range from 2 million VND/m² to 26 million VND/m² and most transactions concentrated between 7 million VND/m² and 17 million VND/m². Overall, plots planned as urban residential land are more valuable than plots planned as rural residential land.
IV. CONCLUSION
This study indicates that the residential land price forecast model using the Random Forest algorithm and 12 input factors (state-regulated land prices, transaction time, bank interest rates, distance to the administrative center, width of the adjacent road, plot shape, plot area, plot orientation, infrastructure, amenities, location coefficient, and planning factors) achieves good forecasting performance. Together, distance to the administrative center, width of the adjacent road, and state-regulated prices account for 74.2% (35.1% + 28.9% + 10.2%) of the residential land price forecast. This study suggests that the Random Forest forecast model can be used to value residential land.
REFERENCES
[1] Lê Thị Hân, Nguyễn Thế Bính, and Bùi Đan Thanh (2022), Analysis of Factors Affecting the Growth of the Vietnamese Real Estate Market, Tạp chí Tài chính.
[2] Nguyễn Mạnh Hùng (2022), The Vietnamese Real Estate Market: Current Situation and Solutions, Trường Đại học Cần Thơ.
[3] Nguyễn Hoàng Huy (2021), Mobilizing Resources to Develop Đà Lạt City into a Knowledge City, Trường Đại học Kinh tế Thành phố Hồ Chí Minh.
[4] Nguyễn Hữu Ngữ and Dương Quốc Nõn (2017), Textbook on Land Valuation, Nhà xuất bản Đại học Huế.