

Models of Semivariograms

Overview

In spatial statistics, if a spatial process is isotropic, its semivariogram depends on $\mathbf{h}$ only through the distance $d = \left\| \mathbf{h} \right\|$, that is, $\gamma ( \mathbf{h} ) = \gamma (d)$. In that case $\gamma$ does not need a complicated vector or matrix form and can be expressed as a one-dimensional scalar function, $\gamma : \mathbb{R} \to \mathbb{R}$. This means that the relationship between point-referenced data $Y(s)$ and $Y(s + d)$ can be plotted as a simple line graph against distance.

Models [1]

(Following the post on Isotropy of Variograms)

Variogram graphs can be categorized into several types depending on the characteristics of the data.

Setting aside how these are interpreted, drawing such a plot is simple: place the distance $d$ between the data $Y(s), Y(s+d)$ on the x-axis and $\gamma (d)$ on the y-axis. The fact that $\gamma$ is presented as a graph in the first place is what makes the name 'variogram' a natural one.

In actual data, there are rarely many pairs separated by exactly a given distance $d$, so the semivariogram is estimated empirically by grouping the pairwise distances into classes (bins). The above figure is an example of a variogram drawn in Julia, showing not only the semivariogram but also the number of pairs in each distance class $h = d$ [2].
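The binning idea can be sketched in a few lines of Python. This is only a minimal illustration, not the method used by the Julia package that produced the figure: it assumes the classical estimator that averages half the squared differences over all pairs in a distance class, and the function name `empirical_semivariogram` and its parameters are made up for this example.

```python
import math
from itertools import combinations


def empirical_semivariogram(coords, values, bin_width, n_bins):
    """Binned empirical semivariogram (hypothetical helper for illustration).

    coords: list of (x, y) tuples; values: observations Y(s) at those points.
    Returns (bin_center, gamma_hat, pair_count) for every non-empty bin.
    """
    sums = [0.0] * n_bins    # running sum of squared differences per bin
    counts = [0] * n_bins    # number of point pairs per bin
    for i, j in combinations(range(len(values)), 2):
        d = math.dist(coords[i], coords[j])
        k = int(d // bin_width)          # index of the distance class
        if k < n_bins:                   # pairs beyond the last bin are ignored
            sums[k] += (values[i] - values[j]) ** 2
            counts[k] += 1
    return [
        ((k + 0.5) * bin_width, sums[k] / (2 * counts[k]), counts[k])
        for k in range(n_bins)
        if counts[k] > 0
    ]


# Toy data: four points on a line, values growing along it.
coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
values = [1.0, 2.0, 4.0, 7.0]
result = empirical_semivariogram(coords, values, bin_width=1.0, n_bins=4)
for center, gamma_hat, n_pairs in result:
    print(f"bin center {center:.1f}: gamma = {gamma_hat:.2f} ({n_pairs} pairs)")
```

The pair count returned for each bin plays the same role as the frequency shown alongside the semivariogram: bins with few pairs give unreliable estimates.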

Equations

Several functions are known as models for fitting variograms.

Here, we will briefly comment on a few important models and move on:

  1. Linear: A model in which the semivariogram grows in proportion to distance. At first glance it seems plausible, but it grows without bound and is hard to interpret in terms of the covariogram, so it is rarely used in practice.
  2. Spherical: A model in which influence disappears completely beyond a certain distance. It is a reasonable choice for many data sets.
  3. Exponential: The simplest and most intuitive model, in which influence decays exponentially as distance increases. It is sufficient for undergraduate-level projects.
  4. Matérn: Among the given models, this one can be applied to the widest variety of data because it includes $\tau^{2}$, which relates to the y-intercept, $\sigma^{2}$, which determines the scale, and $\phi$ and $\nu$, which affect the shape itself. If Exponential is the simple go-to, Matérn is considered the most powerful and most widely used model. The $K_{\nu}$ appearing in its equation is the modified Bessel function of the second kind.
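As a rough illustration of the first three models, here is a minimal Python sketch. The parametrization is an assumption, since the formulas are not spelled out in this text: it follows a common convention with $\tau^{2}$, $\sigma^{2}$, and a decay parameter $\phi$, under which the spherical model levels off at $t = 1/\phi$. The function names are made up for illustration. Matérn is left out because it needs $K_{\nu}$, which is not available in the Python standard library.

```python
import math


def linear(t, tau2, sigma2):
    """gamma(t) = tau2 + sigma2 * t for t > 0; unbounded as t grows."""
    # gamma(0) = 0 by definition; tau2 only appears in the limit t -> 0+.
    return 0.0 if t == 0 else tau2 + sigma2 * t


def spherical(t, tau2, sigma2, phi):
    """Rises to tau2 + sigma2 and stays flat once t >= 1/phi (assumed convention)."""
    if t == 0:
        return 0.0
    if t >= 1.0 / phi:
        return tau2 + sigma2
    u = phi * t
    return tau2 + sigma2 * (1.5 * u - 0.5 * u ** 3)


def exponential(t, tau2, sigma2, phi):
    """Approaches tau2 + sigma2 asymptotically as t grows (assumed convention)."""
    return 0.0 if t == 0 else tau2 + sigma2 * (1.0 - math.exp(-phi * t))


# Evaluate each model on a small grid of distances.
for t in (0.0, 0.25, 0.5, 1.0, 2.0):
    print(t,
          round(linear(t, 0.1, 1.0), 3),
          round(spherical(t, 0.1, 1.0, 1.0), 3),
          round(exponential(t, 0.1, 1.0, 1.0), 3))
```

Printing the three columns side by side shows the qualitative difference: the linear model keeps growing, the spherical model becomes exactly flat, and the exponential model only creeps toward its ceiling.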

Now let's look at how to read a semivariogram through its equation. Before that, it is good to remember the following identity: $\gamma$ and $C$ trade off against each other, and since $C$ represents covariance, a high value of $\gamma$ means a weaker relationship between the data. $$ \operatorname{Var} Y = \gamma ( \mathbf{h} ) + C ( \mathbf{h} ) $$
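For reference, this identity can be derived in a few lines. The derivation assumes the usual definition $\gamma ( \mathbf{h} ) = \frac{1}{2} \operatorname{Var} \left[ Y(s + \mathbf{h}) - Y(s) \right]$ and a weakly stationary process, so that the variance is the same at every location and $C ( \mathbf{h} ) = \operatorname{Cov} \left( Y(s + \mathbf{h}), Y(s) \right)$:

$$ \begin{align*} \gamma ( \mathbf{h} ) &= \frac{1}{2} \operatorname{Var} \left[ Y(s + \mathbf{h}) - Y(s) \right] \\ &= \frac{1}{2} \left[ \operatorname{Var} Y(s + \mathbf{h}) + \operatorname{Var} Y(s) - 2 \operatorname{Cov} \left( Y(s + \mathbf{h}), Y(s) \right) \right] \\ &= \operatorname{Var} Y - C ( \mathbf{h} ) \end{align*} $$

Setting $\mathbf{h} = 0$ gives $C(0) = \operatorname{Var} Y$ and therefore $\gamma (0) = 0$.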

Generally, a variogram shows $\gamma (t)$ increasing with $t$ until, past a certain point, it no longer grows. Intuitively, this says that the relationship between the data weakens as distance increases, and beyond a certain distance the data are essentially unrelated.

Nugget

$$ \text{Nugget} := \gamma \left( 0^{+} \right) = \lim_{t \to 0+} \gamma (t) = \tau^{2} $$ Among the known models, $\tau^{2}$ corresponds to the value known as the Nugget.

  • Theoretically, since $\operatorname{Var} Y = \gamma ( \mathbf{h} ) + C ( \mathbf{h} )$ and $C ( 0 ) = \operatorname{Var} Y$, we should have $\gamma (0) = 0$ at $\mathbf{h} = 0$, but
  • when handling actual data, exactly $\left\| \mathbf{h} \right\| = 0$ is practically meaningless, and even observations at very close points show some differences.

Thus, the y-intercept that emerges contrary to theory is called the Nugget.

Sill

$$ \text{Sill} := \lim_{t \to \infty} \gamma (t) = \tau^{2} + \sigma^{2} $$ Among the known models, the value corresponding to $\tau^{2} + \sigma^{2}$ is called the Sill of $\gamma (t)$. Depending on the model, $\gamma (t)$ may never actually reach this value in theory, but if we treat coming within a suitable tolerance such as $0.05$ as 'having reached the ceiling,' that level can be called the Sill. In particular, the height $\sigma^{2}$ between the Nugget and the Sill is referred to as the Partial Sill.

Range

The distance at which $\gamma (t)$ first reaches the Sill is referred to as the Range. In particular, if the Sill is determined using a tolerance, this distance is also called the Effective Range.
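To make the tolerance idea concrete, here is a small sketch under an assumed parametrization that this text does not state explicitly: an exponential model $\gamma (t) = \tau^{2} + \sigma^{2} \left( 1 - e^{-\phi t} \right)$ and a spherical model that reaches its Sill at $t = 1/\phi$. With a tolerance of $0.05$, the exponential model is treated as having reached the Sill when $e^{-\phi t} = 0.05$. The function names are hypothetical.

```python
import math


def effective_range_exponential(phi, tol=0.05):
    """Distance t solving exp(-phi * t) == tol under the assumed exponential model."""
    return -math.log(tol) / phi


def range_spherical(phi):
    """Under the assumed spherical model the Sill is reached exactly at t = 1/phi."""
    return 1.0 / phi


phi = 2.0
t_eff = effective_range_exponential(phi)   # roughly 3 / phi, since -ln(0.05) is about 3
t_sph = range_spherical(phi)
print(f"exponential effective range: {t_eff:.4f}, spherical range: {t_sph:.4f}")
```

The contrast shows why the Effective Range is needed: the spherical model has an exact Range, while the exponential model only has one once a tolerance is fixed.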


  1. Banerjee. (2015). Hierarchical Modeling and Analysis for Spatial Data (2nd Edition): pp. 24-29.

  2. https://juliaearth.github.io/GeoStats.jl/stable/variography/empirical.html#Variograms