Doctor of Philosophy, University of Toledo, 2021, Spatially Integrated Social Science
This dissertation bridges the gap between spatial econometrics and machine learning under the theoretical banner of spatial data science. Methodologically, it uses the spatial error model, the spatial lag model, and the randomForest algorithm to predict Human Development Index (HDI) values within Morocco at the commune scale. The prediction task uses the Moroccan censuses of 2004 and 2014. The results show that, in this specific case, randomForest can outperform the traditional spatial econometric models in terms of numeric accuracy.
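For orientation, the two spatial econometric models named above are usually written in the following textbook form. The notation here (a row-standardized spatial weights matrix $W$, spatial parameters $\rho$ and $\lambda$, coefficients $\beta$) is the conventional one and is an assumption of this note, not necessarily the notation used in the dissertation.

```latex
% Spatial lag model: the outcome depends on neighbouring outcomes.
y = \rho W y + X\beta + \varepsilon,
\qquad \varepsilon \sim (0, \sigma^2 I)

% Spatial error model: spatial dependence enters through the disturbance.
y = X\beta + u,
\qquad u = \lambda W u + \varepsilon,
\qquad \varepsilon \sim (0, \sigma^2 I)
```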
Since spatial thinkers are just as concerned with spatial accuracy as they are with numeric accuracy, post-estimation procedures were developed to assess the spatial accuracy of the spatial error model, the spatial lag model, and randomForest in the Moroccan case. These procedures were developed for both the global and local levels. At both levels, randomForest outperforms both spatial econometric models in terms of spatial accuracy within the Moroccan case.
With the Morocco-specific results complete, the dissertation moves to simulated data experiments in order to assess different properties of randomForest vs. the spatial lag model, and randomForest vs. the spatial error model. The simulation experiments are carried out using five different data generation processes. Throughout the experiments, bias, consistency, efficiency, and spatial prediction performance are evaluated and compared. These experiments show that when either the spatial lag model or the spatial error model is the correct model specification, randomForest is unable to outperform it in terms of bias, consistency, efficiency, or spatial prediction performance. Therefore, it is concluded that if randomForest does outperform the traditional spatial econometric models, as happened in the Moroccan case, neither the spatial lag model nor the spatial error model is the correct m (open full item for complete abstract)
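As an illustration of what one such data generation process might look like, the sketch below simulates data from a spatial lag process on a regular grid. It is a minimal example under stated assumptions, not the dissertation's actual experimental design: the grid layout, rook contiguity, the parameter values, and the helper names `rook_weights` and `simulate_spatial_lag` are all hypothetical.

```python
import numpy as np


def rook_weights(n_side):
    """Row-standardized rook-contiguity weights for an n_side x n_side grid.

    Hypothetical helper: assumes n_side >= 2 so every cell has a neighbour.
    """
    n = n_side * n_side
    W = np.zeros((n, n))
    for r in range(n_side):
        for c in range(n_side):
            i = r * n_side + c
            # Neighbours share an edge: up, down, left, right.
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < n_side and 0 <= cc < n_side:
                    W[i, rr * n_side + cc] = 1.0
    # Row-standardize so each row sums to one.
    return W / W.sum(axis=1, keepdims=True)


def simulate_spatial_lag(W, beta, rho, rng):
    """Draw (X, y, eps) from y = rho * W y + X beta + eps.

    The reduced form y = (I - rho W)^{-1} (X beta + eps) is solved directly.
    """
    n = W.shape[0]
    X = rng.normal(size=(n, len(beta)))
    eps = rng.normal(size=n)
    y = np.linalg.solve(np.eye(n) - rho * W, X @ beta + eps)
    return X, y, eps


# Assumed illustrative values, not taken from the dissertation.
rng = np.random.default_rng(0)
W = rook_weights(10)
beta = np.array([1.0, -0.5])
X, y, eps = simulate_spatial_lag(W, beta, rho=0.6, rng=rng)
```

Repeating such draws many times and fitting each competing model to every draw is one common way to compare bias and efficiency across estimators.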
Committee: Oleg Smirnov Dr. (Advisor); Neil Reid Dr. (Committee Member); Sujata Shetty Dr. (Committee Member); David Nemeth Dr. (Committee Member); Jack Kalpakian Dr. (Committee Member)
Subjects: Geographic Information Science; Geography