What is Spatial Data Analysis?
Explanation[1]
Spatial data is data that carries information about location, and spatial statistics is the branch of statistics that analyzes it, taking 'space' in its literal dictionary sense as Euclidean space $\mathbb{R}^{r}$. Whereas time series analysis deals with data that varies along a time axis $t$, spatial data analysis deals with data that varies with location in a given region $D \subset \mathbb{R}^{r}$ (usually with $r = 2$).
Because the index describing the data now has dimension $r > 1$, spatial data naturally comes in more varieties than time series data. It can fundamentally be classified into the following three main types.
Point-Referenced Data
Point-referenced data assumes that the location $s$ varies continuously over a fixed $D \subset \mathbb{R}^{r}$, and represents the data observed at each location as a random vector $Y(s)$. Most data that comes with coordinates is of this type. The example above shows concentrations measured at PM2.5 monitoring stations, displayed on a map according to their coordinates.
Point-referenced data is also called geostatistical data.
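As a minimal sketch of what "fixed $D$, continuously varying $s$" means in practice, the snippet below stores a few observations $Y(s)$ at fixed coordinates and estimates the value at a location that was never observed. The station coordinates and values are made up for illustration, and inverse-distance weighting is used only because it is short; it is not a method taken from the text or from the cited reference.

```python
import math

# Hypothetical monitoring stations: fixed coordinates s in D (r = 2),
# each with one observed value Y(s). All numbers are invented.
stations = {
    (0.0, 0.0): 12.0,
    (1.0, 0.0): 20.0,
    (0.0, 1.0): 16.0,
}

def idw(s, observations, power=2.0):
    """Inverse-distance-weighted estimate of Y at an arbitrary location s."""
    num = 0.0
    den = 0.0
    for site, value in observations.items():
        d = math.dist(s, site)
        if d == 0.0:
            return value  # s coincides with an observed site
        w = 1.0 / d ** power
        num += w * value
        den += w
    return num / den

# s may be any point of D, not only a station location.
estimate = idw((0.5, 0.5), stations)
```

Because $s$ ranges over a continuum, a question such as "what is the value between the stations?" is meaningful for this type of data, which is not the case for the next two types.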
Areal Data
Like point-referenced data, areal data has a fixed $D \subset \mathbb{R}^{r}$, but it differs in that $D$ is divided into a finite number of partitions. Point-referenced data might seem universally applicable, but some data is reported not by coordinates but by administrative divisions defined by human society, such as city, county, ward, town, and neighborhood, and areal data is what expresses this. The example above shows poverty levels over irregularly shaped partitions rather than at coordinates.
When the partition of $D$ is regular in shape, that is, cut neatly and uniformly unlike in the example, areal data is also referred to as lattice data.
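The following sketch illustrates the regular (lattice) case under simple assumptions: $D = [0, 2) \times [0, 2)$ is cut into four unit squares, and one value is attached to each cell rather than to each coordinate. The cell size, the function names, and the per-cell values are hypothetical.

```python
# D = [0, 2) x [0, 2), partitioned into a regular 2 x 2 lattice of unit cells.
CELL_SIZE = 1.0

# One value per partition (for example, a poverty rate). Numbers are invented.
areal_values = {
    (0, 0): 0.10,
    (0, 1): 0.25,
    (1, 0): 0.18,
    (1, 1): 0.31,
}

def cell_of(s):
    """Return the (row, col) index of the cell containing location s = (x, y)."""
    x, y = s
    return int(y // CELL_SIZE), int(x // CELL_SIZE)

def value_at(s):
    """Every location inside the same cell shares that cell's single value."""
    return areal_values[cell_of(s)]
```

Two different locations in the same cell, such as `(0.2, 1.7)` and `(0.9, 1.1)`, return the same value, which is the sense in which the data is indexed by partition and not by coordinate. An irregular partition such as administrative boundaries would need polygon geometry in place of the integer division used here.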
Point Pattern Data
Unlike the previous two types, point pattern data is data for which $D \subset \mathbb{R}^{r}$ itself is random. In particular, when $Y(s) = 1$ for all $s \in D$, the data conveys only the fact that an event occurred at each location. The example above is an illustration commonly used to explain survivorship bias, showing which parts of American fighter planes returning from World War II were damaged.[2] In this case, the locations of damage are not predetermined like monitoring stations or administrative divisions but vary from one realization to the next, and 'damage' is the event of the point pattern data.
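To make the contrast with the first two types concrete, here is a small simulation sketch in which the set of locations is itself the random outcome. The unit-square window, the uniform placement, the cap on the number of events, and the function name are all assumptions made for illustration, not a model from the text.

```python
import random

def simulate_pattern(rng, max_events=20):
    """Draw a random number of event locations uniformly in the unit square."""
    n = rng.randint(1, max_events)  # the number of events is itself random
    return [(rng.random(), rng.random()) for _ in range(n)]

# Two realizations: the locations differ because D is random.
pattern_a = simulate_pattern(random.Random(1))
pattern_b = simulate_pattern(random.Random(2))

# With Y(s) = 1 at every observed s, the data says only "an event occurred here".
marks_a = {s: 1 for s in pattern_a}
```

With monitoring stations or administrative districts, repeating the study yields new values at the same locations; here, repeating it yields new locations.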
[1] Banerjee. (2003). Hierarchical Modeling and Analysis for Spatial Data: pp. 16-18.
[2] https://www.andrewahn.co/silicon-valley/survivorship-bias/