In Spatial Data Analysis, for a Random FieldY=(Y(s1),⋯,Y(sn)) that follows a Multivariate Normal Distributionε∼Nn(0,Σ) with Meanμ∈R and Covariance MatrixΣ∈Rn×n, the following model
Y=μ1+ε
is used to estimate ■code.■ at a new ■code. Site ■code.■ s0. This estimated value is called the Ordinary Kriging Estimate. The act of setting up this model for estimation is also referred to as Kriging.
Universal Kriging
X:=11⋮1X1(s1)X1(s2)⋮X1(sn)⋯⋯⋱⋯Xp(s1)Xp(s2)⋮Xp(sn)
For an independent variable X1,⋯,Xp, when using the following Design Matrix and the Vector of Regression Coefficientsβ=(β0,β1,⋯,βp) for a Multivariate Normal Distributionε∼Nn(0,Σ), the following model
Y=Xβ+ε
is used to estimate ■code.■ using ■code.■ at a new ■code. Site ■code.■ s0. This process is called Universal Kriging.
Universal Kriging, simply put, is a type of Kriging that incorporates Multiple Regression Analysis into Kriging. In Ordinary Kriging, μ plays the same role as β0 in Universal Kriging. If we compare it to Time Series Analysis, it can be seen as a Dynamic Regression Model, where we consider not only the spatial structure but also other variables that can explain the variable of interest Y.
If we look at the equations, the regression part might seem a bit different, but if we move Xβ to the left-hand side to make
(Y−Xβ)=0+ε
it’s essentially the same as Ordinary Kriging where the mean vector is a zero vector μ=0.
Formulas
Let’s say the Random Field{Y(sk)}k=1n is an Intrinsic Stationary Spatial Process, and we want to predict a new point denoted as s0. Let’s define Vectorγ∈Rn according to Variogram2γ as follows. For some vector λ=(λ1,⋯,λn), the Best Linear Unbiased Predictor (BLUP) of ■code.■ is
λTY=[λ1⋯λn]Y(s1)Y(s2)⋮Y(sn)=k=1∑nλkY(sk)
and vector λ is specifically calculated as follows.
λ=Σ−1γ+Σ−1X(XTΣ−1X)−1(x0−XTΣ−1γ)
E[(Y(s0)−h(y))2y]
Our goal is to find ■code.■ that minimizes the above error, given data y=(y(s1),⋯,y(sn)).
=≥E[(Y(s0)−h(y))2y]E[(Y(s0)−E(Y(s0)∣y))2y]+[E(Y(s0)∣y)−h(y)]2E[(Y(s0)−E(Y(s0)∣y))2y]
The necessary and sufficient condition for this equation to hold is ■code.■ h(y)=E(Y(s0)∣y).
Part 2. Conditional Multivariate Normal Distribution
Assuming that the given data follows a Multivariate Normal DistributionNn(0,Σ), and adding a new ■code.■ Y0:=Y(S0)∼N(μ0,σ02) still follows a (1+n) multivariate normal distribution, we start with this assumption. Represented in block matrix form, it looks like the following.
Y0Y1⋮Yn=[Y0Y]∼N1+n([μ0μ],[σ02γγTΣ])
Therefore, given the data Y, the expected value of Y0 at s0 regarding ■code.■ x0=x(s0)∈Rp is
==E(Y(s0)∣Y)μ0+γTΣ−1(Y−Xβ)x0Tβ+γTΣ−1(Y−Xβ)
Meanwhile, the Vector of Regression Coefficientsβ cannot be just determined by OLS [Ordinary Least Square] β^=(XTX)−1XTY since the components of ε following a multivariate normal distribution are not independent. It requires the use of WLS [Weighted Least Square] β^=(XTΣ−1X)−1XTΣ−1Y.
===E(Y(s0)∣Y)x0Tβ+γTΣ−1Y−γTΣ−1Xβx0Tβ+γTΣ−1Y−γTΣ−1XβγTΣ−1Y+x0T(XTΣ−1X)−1XTΣ−1Y−γTΣ−1X(XTΣ−1X)−1XTΣ−1Y
Here, Σ being a Symmetric Matrix implies (Σ−1)T=Σ−1, and combining with Y, we get the following form λTY.
==γTΣ−1Y+x0T(XTΣ−1X)−1XTΣ−1Y−γTΣ−1X(XTΣ−1X)−1XTΣ−1Y[Σ−1γ+Σ−1X(XTΣ−1X)−1(x0−XTΣ−1γ)]TYλTY
■
In actual analysis, estimates for σ02 and Σ are determined through the Semivariogram Model as follows.
σ02=Σ=σ2+τ2σ2H(ϕ)+τ2I
Here, τ2 is the nugget effect variance (seen as the basic covariance despite the distance in real data), and I is the Identity Matrix.
Banerjee. (2015). Hierarchical Modeling and Analysis for Spatial Data(2nd Edition): p41~42. ↩︎