Spatial statistical methodologies in COVID-19 studies: A systematic review

Objective: As of August 2023, COVID-19 had claimed nearly 7 million lives, making it one of the deadliest pandemics in recent history. The use of advanced technologies and methods is therefore essential in combating the COVID-19 pandemic. This paper aims to systematically review and synthesise applications of spatial statistical methodologies in the analysis of COVID-19. Material and Methods: A total of 55 articles were selected from four main digital databases: Web of Science, SCOPUS, PubMed/MEDLINE, and Google Scholar. Three distinct themes in the use of spatial statistical techniques for the analysis of COVID-19 are discussed, namely (i) applications of spatial regressions in the evaluation of COVID-19's effects, (ii) COVID-19 mapping using hotspot and spatial clustering analyses, and (iii) applications of interpolation and geostatistics in COVID-19 studies. Results: Spatial regressions can support the assessment of the impacts of COVID-19 on the socio-economy and the environment, hotspot and spatial clustering analyses can effectively support COVID-19 mapping, and geostatistics and interpolation are crucial for predicting COVID-19. Conclusion: Spatial statistical methodologies play a crucial role in COVID-19 studies and can be applied effectively in combating the pandemic.


Introduction
On February 11, 2020, the World Health Organization (WHO) named the disease behind the cluster of unusual pneumonia cases reported in Wuhan, China, in December 2019 Coronavirus disease 2019 (COVID-19) (1). The COVID-19 pandemic has ravaged the world, infecting hundreds of millions of people and killing millions while crippling economies, closing borders, and causing devastation on an unprecedented scale (2). According to the latest data reported by WHO, as of August 2023 the COVID-19 pandemic had caused over 768.9 million confirmed cases and 6.9 million deaths worldwide (3). The rapid spread of the outbreak has caused losses in the tourism, aviation, agricultural, and financial industries. Additionally, governments around the world have been forced to drastically reduce both supply and demand in the economy (4). Many attempts have therefore been made to apply advanced technologies and techniques to the fight against the COVID-19 pandemic. Spatial statistical methodologies have mainly been used to study natural resources (5), the environment (6), and climate change (7). Many socioeconomic studies utilising spatial statistical methodologies have also been reported (8,9). Given their wide range of applications, spatial statistical methodologies have also been utilised in studies of medicine (10), epidemiology (11), and health science (12). In particular, since COVID-19 spread around the world, many attempts have been made to study the pandemic using spatial statistics. However, little attention has been paid to summarising the use of spatial statistics in COVID-19 investigations. This research therefore attempts to comprehensively review and synthesise applications of spatial statistical methodologies in the investigation of the COVID-19 pandemic. Specifically, the content is presented under three sub-sections, namely (i) applications of spatial regressions in the evaluation of COVID-19's effects, (ii) COVID-19 mapping using hotspot and spatial clustering analyses, and (iii) applications of interpolation and geostatistics in COVID-19 studies.

Materials
Four digital databases, Web of Science, SCOPUS, PubMed/MEDLINE, and Google Scholar, were the main sources used to gather scientific papers. A total of 55 papers that met the inclusion requirements were used and categorised into three major themes. These studies were chosen for their high citation counts and their relevance to COVID-19 and spatial statistical techniques. Most were published during the COVID-19 pandemic.

Methods
First, the four digital databases (Web of Science, Google Scholar, PubMed/MEDLINE, and SCOPUS) were queried for each of three separate themes. For the first theme, the use of spatial regressions to evaluate the effects of COVID-19, numerous keyword combinations were used, including "application," "the use," "spatial regressions," "assessment," "impacts," "COVID-19," "SARS-CoV-2," and "review" or "overview." For the second theme, applications of hotspot and spatial clustering analyses in COVID-19 mapping, combinations of keywords such as "application," "the use," "hotspots," "spatial clustering," "autocorrelation," "COVID-19," and "SARS-CoV-2" were used. For the final theme, the use of geostatistics and interpolation in COVID-19 research, the keyword combinations included "application," "the use," "interpolation," "geostatistics," "Kriging," "CoKriging," "COVID-19," and "SARS-CoV-2." Finally, the uses of spatial statistical approaches in the investigation of the COVID-19 pandemic were summarised and discussed under three distinct subtopics.

Applications of spatial regressions in the evaluation of COVID-19's effects
Table 1 shows an overview of applications of spatial regressions in the evaluation of COVID-19's effects. Ordinary least squares (OLS) regression is one of the earliest and most widely used regression techniques for finding meaningful relationships between dependent and independent variables (13). The OLS model has been used extensively in several investigations to examine the correlations between COVID-19 and influencing factors. For instance, the impacts of the COVID-19 outbreak on human sexual behaviour were analysed statistically using OLS together with multivariate logistic regression (MLR) analysis (14). Later, building on Tobler's first law of geography, that "everything is related to everything else, but near things are more related than distant things" (15), spatial autoregressive combined (SAC) models have also been used to examine the effects of COVID-19 on air transportation by simultaneously taking into account spatial lag and spatial error parameters (16).
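As a brief illustration of the model family described above, the OLS and SAC specifications can be written in their usual textbook form. The notation is an assumption introduced here for illustration and is not taken from the reviewed studies: y is the vector of observed outcomes (for example, COVID-19 incidence per spatial unit), X the matrix of explanatory variables, W a spatial weights matrix, \rho and \lambda the spatial lag and spatial error parameters, and \varepsilon an independent error term.

```latex
\begin{align}
\text{OLS:}\quad & y = X\beta + \varepsilon \\
\text{SAC:}\quad & y = \rho W y + X\beta + u, \qquad u = \lambda W u + \varepsilon
\end{align}
```

In this form, setting \lambda = 0 gives the spatial lag model, setting \rho = 0 gives the spatial error model, and setting both to zero recovers OLS.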
Geographically weighted regression (GWR), which builds on spatial autocorrelation analysis, is another regression technique often applied to examine the effects of COVID-19. For instance, the spread of the COVID-19 pandemic in Iraq was examined using the GWR model to determine the spatial distribution of the epidemic and the contribution of physical, social, and economic factors (17). The impact of sociodemographic factors on the incidence of COVID-19 in 342 Chinese cities was effectively investigated using a local spatially weighted regression model, and the study found that sociodemographic factors may have an impact on the incidence of COVID-19 (18). The GWR model was used in a study of Texas counties to identify the sensitive factors that affect COVID-19 mortality, including population and hospitalization, adult population, natural supply, economic condition, air quality, and medical care (19). In addition, GWR has been applied to study the associations between the disease and air quality and a variety of socioeconomic factors (13).
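The idea behind GWR can be sketched in a few lines: a separate weighted least-squares regression is fitted at each location, with nearby observations weighted more heavily. The following is a minimal sketch only, not the implementation used in any of the reviewed studies; the function name `gwr_coefficients`, the Gaussian kernel, and the fixed `bandwidth` are assumptions made for illustration.

```python
import numpy as np


def gwr_coefficients(coords, X, y, bandwidth):
    """Hypothetical helper: local regression coefficients at every observation.

    Assumes a Gaussian distance-decay kernel with a fixed bandwidth.
    Returns an (n, p + 1) array; column 0 is the local intercept.
    """
    coords = np.asarray(coords, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    # Add an intercept column so that each local model has its own constant.
    Xd = np.column_stack([np.ones(n), np.asarray(X, dtype=float)])
    betas = np.empty((n, Xd.shape[1]))
    for i in range(n):
        # Distance from regression point i to every observation.
        d = np.linalg.norm(coords - coords[i], axis=1)
        # Gaussian weights: nearby observations count more than distant ones.
        w = np.exp(-0.5 * (d / bandwidth) ** 2)
        XtW = Xd.T * w
        # Weighted least squares: solve (X'WX) beta = X'Wy.
        betas[i] = np.linalg.solve(XtW @ Xd, XtW @ y)
    return betas


# Synthetic example: the effect of x on y grows from west to east.
rng = np.random.default_rng(42)
coords = rng.random((200, 2))
x = rng.random(200)
y = 1.0 + (1.0 + 4.0 * coords[:, 0]) * x
local = gwr_coefficients(coords, x, y, bandwidth=0.15)
west = local[coords[:, 0] < 0.2, 1].mean()
east = local[coords[:, 0] > 0.8, 1].mean()
print(f"mean local slope, west: {west:.2f}, east: {east:.2f}")
```

Mapping the local slopes is what reveals spatially varying relationships. MGWR extends the same idea by allowing a different bandwidth for each explanatory variable.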
A multi-scale geographically weighted regression (MGWR) model was successfully applied to investigate the relationship between reductions in particulate matter (PM2.5 and PM10) air pollution and influencing factors during the COVID-19 outbreak (20). The MGWR results revealed that two socio-economic factors had more significant impacts than meteorological factors (20). Using MGWR, it was determined that land use and pollution had a significant impact on the distribution of COVID-19 in the affected areas of both the Po Valley in Italy and Wuhan in China (21). The spatially non-stationary relationships between sociodemographic determinants and COVID-19 incidence rates were also investigated in Oman using MGWR, and it was found that population density, the proportion of people aged 65 and over, hospital beds, and the prevalence of diabetes were statistically significant determinants of COVID-19 incidence rates (22). Recently, a method based on principal component analysis that takes spatial heterogeneity into account, geographically weighted principal component analysis (GWPCA), was developed for use with geographic data in COVID-19 studies (23). Later, the GWPCA method was effectively applied to investigate the effect of a poor living environment on a COVID-19 hotspot in the megacity of Kolkata, India (24).

COVID-19 mapping using hotspot and spatial clustering analyses
Another spatial statistical method that has been commonly employed in COVID-19 studies is hotspot and spatial clustering analysis. Applications of hotspot and spatial clustering analyses for COVID-19 mapping are summarised in Table 2.
The Getis-Ord Gi* and Moran's I are two of the most popular statistics for assessing local hotspots and spatial clustering (13). Hotspot analysis based on the Getis-Ord Gi* statistic is used to measure the degree of spatial autocorrelation among spatial objects in a geographic unit, or the similarity between spatial objects in nearby geographic units (25), whereas local Moran's I statistics represent how strongly the COVID-19 epidemic is spatially clustered in a geographic unit (26). The spatial distribution of COVID-19 cases has been extensively studied using the Getis-Ord Gi* and Moran's I statistics. For instance, Moran's I and Getis-Ord Gi* statistics were used to examine spatio-temporal clustering patterns and to identify sociodemographic factors associated with COVID-19 infections in Helsinki, Finland (27). The results showed that high-high clusters and high relative risk areas emerged primarily in Helsinki's eastern neighbourhoods, which are socioeconomically vulnerable, with a few exceptions revealing local outbreaks in other areas (27). Using global and local Moran's I statistics to analyse spatio-temporal COVID-19 transmission and its influencing factors, a study in China revealed that the global and local spatial correlation characteristics of the epidemic distribution were positively correlated (28). In Vietnam, a study of the spatiotemporal distribution of COVID-19 over the first seven months of the outbreak used the local Moran's I statistic and found a spatial cluster in Vinh Phuc province in the initial phase (29). Later, using a dataset of 10,742 locally transmitted cases collected from four COVID-19 waves in 63 prefecture-level cities and provinces in Vietnam, the local Moran's I statistic and the Moran scatterplot were also successfully used to identify high-high and low-low clusters and low-high and high-low outliers of COVID-19 cases (30).
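To make the two local statistics concrete, the sketch below computes local Moran's I and the Getis-Ord Gi* z-score with NumPy. It is an illustrative sketch under stated assumptions, not code from the reviewed studies: the function names are hypothetical, the weights are binary contiguity weights, local Moran's I uses row-standardised weights, and the case counts are invented. Significance testing, which real analyses require, is omitted.

```python
import numpy as np


def local_morans_i(x, W):
    """Local Moran's I per unit, using row-standardised weights.

    Assumes every unit has at least one neighbour in W.
    """
    x = np.asarray(x, dtype=float)
    W = np.asarray(W, dtype=float)
    Wr = W / W.sum(axis=1, keepdims=True)
    z = x - x.mean()
    m2 = (z ** 2).sum() / len(x)
    # Positive: unit resembles its neighbours (high-high or low-low).
    # Negative: unit differs from its neighbours (high-low or low-high).
    return z * (Wr @ z) / m2


def getis_ord_gi_star(x, W):
    """Getis-Ord Gi* z-score per unit for binary weights W."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    # Gi* (unlike Gi) counts each unit as a neighbour of itself.
    Ws = np.asarray(W, dtype=float).copy()
    np.fill_diagonal(Ws, 1.0)
    xbar = x.mean()
    s = np.sqrt((x ** 2).mean() - xbar ** 2)
    wsum = Ws.sum(axis=1)
    w2sum = (Ws ** 2).sum(axis=1)
    numerator = Ws @ x - xbar * wsum
    denominator = s * np.sqrt((n * w2sum - wsum ** 2) / (n - 1))
    return numerator / denominator


# Six districts along a line; neighbours share a border (invented counts).
cases = np.array([9.0, 8.0, 7.0, 1.0, 2.0, 1.0])
W = np.zeros((6, 6))
for i in range(5):
    W[i, i + 1] = W[i + 1, i] = 1.0

print(np.round(local_morans_i(cases, W), 2))
print(np.round(getis_ord_gi_star(cases, W), 2))
```

In this toy example the first district has a positive Gi* (a candidate hotspot) and the last a negative Gi* (a candidate coldspot), while both have a positive local Moran's I because each resembles its neighbour.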
The global univariate Moran's I approach is the most popular alternative to the local Moran's I in COVID-19 studies, although it has primarily been employed with socioeconomic data (13). Owing to its ability to quantify spatial autocorrelation, it has been commonly employed in health science studies. For example, the global univariate Moran's I statistic has been widely used to investigate space-time patterns of the COVID-19 pandemic in São Paulo State, Brazil (31), to investigate the socioeconomic factors that affect the spatial spread of COVID-19 in the United States (32), and to determine how the combination of human and environmental factors in China influences the spread of COVID-19 (33). Additionally, the global Moran's I statistic has been utilised for mapping COVID-19 vulnerability and risks. For instance, it was applied to the 2020 geographical analysis of COVID-19 mortality and morbidity risk in Europe and the Mediterranean (34) and to the identification of COVID-19 transmission risk clusters in northern Brazil (35).
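The global Moran's I discussed above reduces a whole map to a single number, roughly between -1 (neighbouring units are dissimilar) and +1 (neighbouring units are similar). The following minimal sketch computes it with NumPy and attaches a permutation-based pseudo p-value. The function names, the binary contiguity weights, and the example values are assumptions for illustration only.

```python
import numpy as np


def global_morans_i(x, W):
    """Global Moran's I for values x and spatial weights matrix W."""
    x = np.asarray(x, dtype=float)
    W = np.asarray(W, dtype=float)
    z = x - x.mean()
    s0 = W.sum()  # sum of all weights
    return len(x) / s0 * (z @ W @ z) / (z @ z)


def permutation_p_value(x, W, n_perm=999, seed=0):
    """One-sided pseudo p-value for positive spatial autocorrelation.

    Values are shuffled across units to simulate spatial randomness.
    """
    rng = np.random.default_rng(seed)
    observed = global_morans_i(x, W)
    sims = np.array(
        [global_morans_i(rng.permutation(x), W) for _ in range(n_perm)]
    )
    return (np.sum(sims >= observed) + 1) / (n_perm + 1)


# Six units along a line; adjacent units are neighbours.
W = np.zeros((6, 6))
for i in range(5):
    W[i, i + 1] = W[i + 1, i] = 1.0

gradient = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])       # smooth trend
checker = np.array([1.0, -1.0, 1.0, -1.0, 1.0, -1.0])     # alternating

print(global_morans_i(gradient, W))   # positive: similar values cluster
print(global_morans_i(checker, W))    # negative: neighbours differ
print(permutation_p_value(gradient, W))
```

With only six units the permutation test has little power; it is included only to show where the significance of a reported Moran's I typically comes from.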

Applications of interpolation and geostatistics in COVID-19 studies
The use of interpolation techniques (Inverse Distance Weighting, Voronoi, or cubic spline), as well as geostatistics, in the analysis of COVID-19 is discussed in this section. Interpolation has been widely employed to address the spatial and spatio-temporal patterns of the pandemic and its association with atmospheric factors, the environment, and socioeconomic factors (13). Numerous studies have paid special attention to the use of interpolation and geostatistics for predicting COVID-19 during the severe lockdowns in many countries, particularly those hit hardest by COVID-19. Table 3 summarises the applications of interpolation and geostatistics in COVID-19 studies.
Inverse Distance Weighting (IDW) is one of the most commonly used interpolation methods. In one of the first COVID-19 studies to employ the IDW approach, the effects of the COVID-19 lockdown on community movement in several Indian states were examined, with IDW used to compare pre- and post-lockdown mobility trends (36). In Indonesia, the impact of the COVID-19 epidemic on ambient air quality parameters in the Yogyakarta Urban Area, including SO2, CO, and NO2, was also successfully evaluated with the aid of the IDW technique (37). A total of six geostatistical models were used to analyse association patterns when examining socioexposomic connections with COVID-19 outcomes across New Jersey (49). Geostatistical methods were also utilised to detect COVID-19 hotspots in the 2020 North Carolina general election in the US (50). Later, to determine the probable change in emissions in a post-COVID period, a cubic spline technique was successfully utilised to predict the second-by-second speed of buses and taxis from vehicle GPS data (51).
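IDW estimates the value at an unsampled location as a weighted average of the observed values, with weights proportional to an inverse power of distance. The sketch below illustrates this under stated assumptions: the function name `idw_interpolate`, the default power of 2, and the station coordinates and readings are hypothetical and not taken from the cited studies.

```python
import numpy as np


def idw_interpolate(known_xy, known_values, query_xy, power=2.0):
    """Inverse Distance Weighting estimate at each query point."""
    known_xy = np.asarray(known_xy, dtype=float)
    known_values = np.asarray(known_values, dtype=float)
    query_xy = np.atleast_2d(np.asarray(query_xy, dtype=float))
    # Pairwise distances: rows are query points, columns are known points.
    d = np.linalg.norm(
        query_xy[:, None, :] - known_xy[None, :, :], axis=2
    )
    estimates = np.empty(len(query_xy))
    for i, row in enumerate(d):
        exact = row == 0
        if exact.any():
            # Query coincides with a sample point: return its value.
            estimates[i] = known_values[exact][0]
        else:
            w = 1.0 / row ** power
            estimates[i] = np.sum(w * known_values) / np.sum(w)
    return estimates


# Hypothetical monitoring stations (x, y) and pollutant readings.
stations = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]])
readings = np.array([10.0, 20.0, 30.0, 40.0])

print(idw_interpolate(stations, readings, [[1.0, 1.0], [0.5, 0.5]]))
```

Because the estimate is a weighted average, it always lies between the smallest and largest observed values. A larger `power` makes the surface follow the nearest station more closely.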
Among geostatistical methods, Kriging is the most widely applied, not only in studies of natural resource management and climate change but also in studies of COVID-19. In COVID-19 research, Kriging has been used to predict both COVID-19 cases and climatic-environmental variables. For example, Kriging was used to predict COVID-19 cases in the US (52) and the spatiotemporal change of cumulative COVID-19 cases in India (53). For climatic-environmental variables, Kriging was successfully utilised to evaluate the spatial fluctuations of air quality when investigating the relationship between air pollution and the COVID-19 pandemic in Mumbai, India (54). In addition, Cokriging is another geostatistical technique that has been used in COVID-19 research. For example, the spatial distributions of PM2.5 and benzene in Almaty, Kazakhstan, were analysed via Cokriging for 2018-2019 and 2020, respectively (55).
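Unlike IDW, Kriging derives its weights from a model of spatial dependence (the variogram) and also returns a prediction variance. The following is a minimal sketch of ordinary Kriging at a single location. The exponential variogram with zero nugget and the `sill` and `range_param` values are assumptions chosen for illustration; in practice the variogram is fitted to the data. The function names and sample values are hypothetical.

```python
import numpy as np


def exponential_variogram(h, sill, range_param):
    """Exponential variogram with zero nugget: gamma(0) = 0."""
    h = np.asarray(h, dtype=float)
    return sill * (1.0 - np.exp(-h / range_param))


def ordinary_kriging(known_xy, known_values, query_xy,
                     sill=1.0, range_param=1.0):
    """Ordinary Kriging prediction and variance at one query point."""
    known_xy = np.asarray(known_xy, dtype=float)
    z = np.asarray(known_values, dtype=float)
    x0 = np.asarray(query_xy, dtype=float)
    n = len(z)
    # Variogram values between all pairs of sample points.
    d = np.linalg.norm(
        known_xy[:, None, :] - known_xy[None, :, :], axis=2
    )
    # Kriging system; the extra row and column hold the Lagrange
    # multiplier that forces the weights to sum to one.
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = exponential_variogram(d, sill, range_param)
    A[n, n] = 0.0
    b = np.ones(n + 1)
    d0 = np.linalg.norm(known_xy - x0, axis=1)
    b[:n] = exponential_variogram(d0, sill, range_param)
    solution = np.linalg.solve(A, b)
    weights, mu = solution[:n], solution[n]
    prediction = weights @ z
    variance = weights @ b[:n] + mu
    return prediction, variance


# Hypothetical sample locations and observed values.
points = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
values = np.array([10.0, 20.0, 30.0, 40.0])

pred, var = ordinary_kriging(points, values, [0.5, 0.5])
print(f"prediction: {pred:.2f}, kriging variance: {var:.3f}")
```

With a zero nugget the predictor reproduces the observed value at a sampled location with zero variance, and the variance grows with distance from the samples. Cokriging extends the same system with cross-variograms between a primary and a secondary variable.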

Conclusion
This work synthesises the uses of spatial statistical methodologies in COVID-19 studies. The study reviewed 55 articles collected from the Web of Science, SCOPUS, PubMed/MEDLINE, and Google Scholar digital databases. The content is presented under three sub-sections, namely applications of spatial regressions in the evaluation of COVID-19's effects, COVID-19 mapping using hotspot and spatial clustering analyses, and applications of interpolation and geostatistics in COVID-19 studies. The results showed that spatial regressions can effectively support the assessment of the impacts of COVID-19 on the socio-economy and the environment, that hotspot and spatial clustering analyses can effectively support COVID-19 mapping, and that interpolation and geostatistics are crucial for predicting COVID-19. The findings of this review not only highlight the crucial roles of spatial statistical methodologies in COVID-19 studies but also provide insight into how spatial statistics can be applied effectively in combating the COVID-19 pandemic.

Table 1
Spatial regression methods used in the evaluation of COVID-19's effects

Table 2
Hotspots and spatial clustering methods used for COVID-19 mapping