Spatial clustering analysis of COVID-19 using LISA: A case study of a 2022 winter day in Hanoi, Vietnam

Background: The outbreak of coronavirus disease 2019 (COVID-19), caused by the emergence of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) in China, spread quickly throughout the world. This study aimed to analyze the spatial clustering of COVID-19. Methods: The global and local Moran's I statistics (LISA) were used to investigate the spatial patterns of COVID-19, including spatial clusters (high-high and low-low) and spatial outliers (low-high and high-low). Results: A case study of locally transmitted COVID-19 cases reported on a 2022 winter day in Hanoi city indicated that the high-high spatial clusters were concentrated entirely in six urban districts of the Hanoi metropolitan area: Dong Da, Gia Lam, Thanh Tri, Hai Ba Trung, Cau Giay, and Long Bien. In contrast, the low-low spatial clusters were mainly in suburban districts such as Ba Vi, Thach That, Phuc Tho, and Son Tay town (0 cases) in the northwest and Ung Hoa district in the south of Hanoi.


Introduction
The COVID-19 pandemic has been a serious global threat to public health since early 2020. COVID-19 is caused by SARS-CoV-2, the severe acute respiratory syndrome coronavirus 2 (1). The virus spread quickly to other countries in eastern Asia, Europe, and the rest of the world (2). As of 16 August 2023, the World Health Organization had received reports of more than 769.9 million confirmed cases of COVID-19, including more than 6.9 million deaths (3). Considerable effort has therefore been devoted to studying the COVID-19 pandemic, particularly the spatial clustering of COVID-19 (4,5).
Geographical factors play crucial roles in the fight against the COVID-19 pandemic (6), and the pandemic has been found to be thoroughly spatial in nature (7). Many efforts have been made to study it from a geographical perspective. For example, the effects of living environment deprivation on COVID-19 hotspots in the Kolkata megacity, India, were assessed using the Getis-Ord G statistic and geographically weighted principal component analysis (8). Exploratory spatial data analysis and the geodetector method have also been employed to analyse the spatial and temporal differentiation characteristics and the influencing factors of the COVID-19 epidemic spread in mainland China (9). Also in China, spatial autocorrelation analysis was used to investigate the spatial clustering characteristics of the COVID-19 pandemic in Beijing (10). In addition, spatial autocorrelation analysis has been successfully applied to COVID-19 studies in many countries, for example to investigate the spatiotemporal interaction effect of COVID-19 transmission in the United States (11), to examine long-term exposure to air pollution and COVID-19 mortality in England (12), to characterise spatial autocorrelation patterns across five waves of COVID-19 in Catalonia, Spain (13), and to analyse the spatio-temporal COVID-19 outbreak in Italy (14).
This study aimed to analyze the spatial clustering of COVID-19 in Hanoi city using the global and local Moran's I statistics (LISA). Spatial clusters (high-high and low-low) and spatial outliers (low-high and high-low) are identified using first-order and second-order contiguity.

Material
In this study, a dataset of COVID-19 cases on a 2022 winter day in Hanoi was used for the analysis of spatial clusters. These locally transmitted COVID-19 cases were reported on 31 January 2022. The spatial distribution of these cases is shown in Figure 1. Figure 1 illustrates that the COVID-19 cases were mainly reported in the northeast of Hanoi city. In particular, high numbers of COVID-19 cases were mainly confirmed in the Hanoi metropolitan area, where the population density is high. Table 1 shows that high numbers of COVID-19 cases were reported mostly in urban districts such as Dong Da (298 cases

Methods
This study employed the global Moran's I statistic to identify the spatial clustering of the COVID-19 pandemic at the global scale (15,16). The global Moran's I statistic is defined in equation (1):

$$I = \frac{n}{\sum_{i=1}^{n}\sum_{j=1}^{n} w_{ij}} \cdot \frac{\sum_{i=1}^{n}\sum_{j=1}^{n} w_{ij}\,(x_i-\bar{x})(x_j-\bar{x})}{\sum_{i=1}^{n}(x_i-\bar{x})^2} \qquad (1)$$

where $x_i$ and $x_j$ are the numbers of new confirmed COVID-19 cases for district i and district j; $\bar{x}$ is the mean number of COVID-19 cases, given by $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$; n is the total number of districts in the whole study area; and $w_{ij}$ is an element of the (n × n) spatial weight matrix (5).

The global Moran's I coefficient takes values in the interval [-1, +1] (5). Positive values of Moran's I indicate positive spatial autocorrelation in the data, whereas negative values indicate negative spatial autocorrelation (17). Values of the global Moran's I coefficient near zero indicate the absence of spatial autocorrelation, that is, a random spatial distribution of COVID-19 cases.
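As an illustration of the global Moran's I statistic described above, the following is a minimal Python sketch using NumPy. The function name `global_morans_i`, the four-district chain, and the case counts are hypothetical and are not taken from the Hanoi dataset; the weights are assumed to be simple binary contiguity weights.

```python
import numpy as np

def global_morans_i(x, w):
    """Global Moran's I for values x (length n) and an n-by-n weight matrix w."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    z = x - x.mean()                     # deviations from the mean
    num = (w * np.outer(z, z)).sum()     # sum_i sum_j w_ij * z_i * z_j
    den = (z ** 2).sum()                 # sum_i z_i^2
    return len(x) / w.sum() * num / den

# Hypothetical example: four districts in a row (1-2-3-4), binary contiguity.
w_chain = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
])
cases_clustered = [10, 9, 1, 0]      # similar counts sit next to each other
cases_alternating = [10, 0, 10, 0]   # neighbouring counts always differ

i_clustered = global_morans_i(cases_clustered, w_chain)
i_alternating = global_morans_i(cases_alternating, w_chain)
print(i_clustered, i_alternating)
```

In this toy example the clustered arrangement gives a positive coefficient and the alternating arrangement gives a negative one, which matches the interpretation of the sign given above.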
The global Moran's I reflects the presence or absence of spatial autocorrelation over the study area as a whole. The local Moran's I statistic was therefore used to quantify the spatial clustering of low and high COVID-19 case counts in each district (5). The local Moran's I statistic ($I_i$) for COVID-19 cases at district i is given by equation (2) (18):

$$I_i = \frac{x_i-\bar{x}}{S^2} \sum_{j \in N_i,\; j \neq i} w_{ij}\,(x_j-\bar{x}) \qquad (2)$$

where $x_i$, $x_j$, $\bar{x}$, and $w_{ij}$ are defined as in equation (1); n is the total number of neighborhood districts (5); $N_i$ denotes the neighborhood set of district i; $j \neq i$ indicates that the sum runs over the deviations $(x_j-\bar{x})$ of the districts neighbouring district i, not including $x_i$ itself; and $S^2$ is the variance of x, given in equation (3). The weight $w_{ij}$ defines neighbor connectivity and can be constructed using first-order or second-order contiguity (Figure 2).
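The construction of the two weight matrices and the local statistic can be sketched in Python as follows. This is only an illustration under stated assumptions: second-order contiguity is taken here to mean neighbours of neighbours, excluding the district itself and its first-order neighbours; the weights are row-standardised; and $S^2$ is assumed to be the population variance (division by n), since equation (3) is not reproduced here. The function names and the five-district chain are hypothetical.

```python
import numpy as np

def second_order(adj):
    """Binary neighbours-of-neighbours matrix, excluding self and first-order neighbours."""
    adj = np.asarray(adj, dtype=int)
    two_step = (adj @ adj) > 0           # districts reachable in exactly two steps
    result = two_step & (adj == 0)       # drop first-order neighbours
    np.fill_diagonal(result, False)      # a district is not its own neighbour
    return result.astype(int)

def row_standardize(w):
    """Scale each row to sum to 1; rows without neighbours stay all zero."""
    w = np.asarray(w, dtype=float)
    sums = w.sum(axis=1, keepdims=True)
    return np.divide(w, sums, out=np.zeros_like(w), where=sums > 0)

def local_morans_i(x, w):
    """Local Moran's I per district; w must have a zero diagonal."""
    x = np.asarray(x, dtype=float)
    z = x - x.mean()
    s2 = (z ** 2).mean()                 # ASSUMPTION: population variance (divide by n)
    lag = w @ z                          # sum_j w_ij * (x_j - mean)
    return z / s2 * lag

# Hypothetical example: five districts in a row (1-2-3-4-5).
adj_chain = np.array([
    [0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
])
cases = [10, 9, 5, 1, 0]                 # made-up case counts

w1 = row_standardize(adj_chain)                  # first-order contiguity
w2 = row_standardize(second_order(adj_chain))    # second-order contiguity
lisa_first = local_morans_i(cases, w1)
lisa_second = local_morans_i(cases, w2)
print(lisa_first)
print(lisa_second)
```

With row-standardised weights and this choice of $S^2$, the mean of the local values equals the global Moran's I for the same weights, which is a convenient consistency check.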
The local Moran's I statistic indicates the level of spatial clustering of COVID-19 cases at each district. As with the global Moran's I statistic, the sign of the local value at district i ($I_i$) indicates the type of spatial association, although $I_i$ is not strictly confined to the interval [-1, +1]. If $I_i = 0$, there is no spatial autocorrelation of COVID-19 cases at district i. If $I_i > 0$, there is positive spatial autocorrelation of COVID-19 cases (5). If $I_i < 0$, there is negative spatial autocorrelation of COVID-19 cases. A high positive $I_i$ shows that district i has a similarly high or low number of COVID-19 cases as its neighbors, and such a district is called a "spatial cluster" (17). When the local spatial autocorrelation is positive, the local Moran's I statistic therefore indicates two types of spatial clusters for COVID-19 cases: high-high and low-low. When the local spatial autocorrelation is negative, it identifies two types of spatial outliers: low-high and high-low. In this work, a randomization test implemented in the spatial statistics software GeoDa (19) was used to test the significance of the spatial autocorrelation statistics. The statistics were generated and tested at a significance level of 0.05 using 999 permutations.
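The randomization test and the four-way classification can be sketched as follows. This is a simplified illustration of a conditional permutation test, not GeoDa's actual implementation: for each district, its own value is held fixed, the remaining values are shuffled 999 times, and a one-sided pseudo p-value is computed as (M + 1) / (R + 1), where M is the number of permuted statistics at least as extreme as the observed one and R is the number of permutations. The function name, the six-district chain, the case counts, and the use of the population standard deviation are assumptions.

```python
import numpy as np

def lisa_with_permutations(x, w, permutations=999, alpha=0.05, seed=0):
    """Local Moran's I, pseudo p-values, and cluster labels (sketch only)."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    n = len(x)
    z = (x - x.mean()) / x.std()         # ASSUMPTION: population standard deviation
    lag = w @ z                          # weighted average of neighbouring z values
    observed = z * lag                   # local Moran's I per district
    rng = np.random.default_rng(seed)
    p_values = np.empty(n)
    for i in range(n):
        others = np.delete(z, i)         # values of all other districts
        w_i = np.delete(w[i], i)         # weights from district i to the others
        sims = np.array([z[i] * (w_i @ rng.permutation(others))
                         for _ in range(permutations)])
        if observed[i] >= 0:
            extreme = np.sum(sims >= observed[i])
        else:
            extreme = np.sum(sims <= observed[i])
        p_values[i] = (extreme + 1) / (permutations + 1)
    labels = []
    for zi, li, p in zip(z, lag, p_values):
        if p > alpha:
            labels.append("not significant")
        elif zi > 0 and li > 0:
            labels.append("high-high")   # high value among high neighbours
        elif zi < 0 and li < 0:
            labels.append("low-low")     # low value among low neighbours
        elif zi > 0:
            labels.append("high-low")    # high value among low neighbours
        else:
            labels.append("low-high")    # low value among high neighbours
    return observed, p_values, labels

# Hypothetical example: six districts in a row with row-standardised weights.
n = 6
adj = np.zeros((n, n))
for i in range(n - 1):
    adj[i, i + 1] = adj[i + 1, i] = 1
w = adj / adj.sum(axis=1, keepdims=True)
cases = [40, 35, 30, 2, 1, 0]            # made-up case counts

observed, p_values, labels = lisa_with_permutations(cases, w)
for district, (i_loc, p, label) in enumerate(zip(observed, p_values, labels), start=1):
    print(district, round(i_loc, 3), round(p, 3), label)
```

With 999 permutations the smallest attainable pseudo p-value is 0.001, which is why significance maps of this kind are usually reported at the 0.05, 0.01, and 0.001 levels.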

Spatial distribution of the COVID-19 cases
A total of 2638 new COVID-19 cases were detected in the community in Hanoi city. The data are summarized in Table 1, and the spatial distribution of new COVID-19 cases is shown in Figure 2. Table 1 and Figure 2 show that high numbers of cases were mainly concentrated in urban districts such as Dong Da (298 cases), Hoang Mai (266 cases), Ha Dong (260 cases), and Nam Tu Liem (196 cases). Low numbers of new cases were concentrated in suburban districts such as Thach That, Phuc Tho, Ba Vi, and My Duc (0 cases), Chuong My (1 case), and Quoc Oai (2 cases).

Analysis of spatial clustering of new COVID-19 cases
The Moran scatter plots in Figure 3 show the degree of global spatial autocorrelation of COVID-19 cases under two different spatial weight matrices. The global Moran's I values were 0.244 and 0.413 for the first-order and second-order contiguity methods, respectively. This result shows that the first-order method produces a higher correlation coefficient than the second-order method, which indicates a higher degree of spatial autocorrelation between COVID-19 cases under first-order contiguity.
In the analysis of spatial clustering, both methods were used to identify the LISA of COVID-19 cases. The spatial clusters identified by LISA using first-order contiguity are shown in Figure 4. Figure 4 (left) shows six high-high clusters, three low-low clusters, no low-high or high-low spatial outliers, and 19 districts and towns with non-significant spatial clustering. The six high-high clusters were mainly distributed in the central area of Hanoi city, where the population density is high. The three low-low clusters were distributed in the northwest and south of the city, in rural areas with lower population densities. Figure 4 (right) and Table 1 illustrate that the high-high and low-low spatial clusters were statistically significant at the 0.05 level: five clusters at the 0.05 level, four at the 0.01 level, and only two at the 0.001 level. Both panels of Figure 4 show that the high-high spatial clusters were statistically significant at the 0.01 level. These high-high spatial clusters were concentrated entirely in six urban districts of the Hanoi metropolitan area: Dong Da, Gia Lam, Thanh Tri, Hai Ba Trung, Cau Giay, and Long Bien.
The spatial clusters determined by the second-order method are shown in Figure 5 and summarized in Table 1.
Figure 5 (left) shows six high-high clusters, three low-low clusters, no low-high or high-low outliers, and 19 districts without statistical significance. The six high-high clusters were again mainly distributed in the central area of Hanoi city, where the population density is high. The low-low clusters were distributed in the northwest of the city, in the suburban area of Hanoi. Figure 5 (right) and Table 1 show that the nine high-high and low-low clusters were statistically significant at the 0.05 level, of which three clusters were significant at the 0.01 level and one at the 0.001 level. Almost all high-high spatial clusters were statistically significant at the 0.01 level. The LISA map shows that the high-high spatial clusters were still concentrated in inner-city districts where most COVID-19 cases were confirmed on 31 January 2022: Dong Da (298 cases), Hoang Mai (266 cases), Hai Ba Trung (127 cases), Cau Giay (116 cases), Ba Dinh (107 cases), and Hoan Kiem (95 cases). The three low-low spatial clusters were in the districts of Ba Vi and Son Tay town (0 cases) and Quoc Oai (2 cases) in the northwest of Hanoi.

Figure 5
Spatial clustering of new COVID-19 cases using second-order contiguity: LISA map (left) and significance map (right)

Conclusion
This study aimed to analyze the spatial clustering of COVID-19. The global and local Moran's I statistics (LISA) were used to investigate the spatial patterns of COVID-19, including spatial clusters (high-high and low-low) and spatial outliers (low-high and high-low). A case study of locally transmitted COVID-19 cases on a 2022 winter day in Hanoi city indicated that the high-high spatial clusters were concentrated entirely in six urban districts of the Hanoi metropolitan area: Dong Da, Gia Lam, Thanh Tri, Hai Ba Trung, Cau Giay, and Long Bien. In contrast, the low-low spatial clusters were mainly in suburban districts such as Ba Vi, Thach That, Phuc Tho, and Son Tay town (0 cases) in the northwest and Ung Hoa district in the south of Hanoi. This study has demonstrated the effective use of LISA in the investigation of COVID-19 spatial clustering. The results can help inform the fight against the COVID-19 pandemic.