Let us set up the notations first. Suppose a there exists a partition of a region D ∈ ℛ2 (e.g., a city). This partition is denoted by Ai, i = 1, …, n. Moreover, there exists another partition of the same city, denoted Bj, where j = 1, …, m. These partitions can be seen as two different administrative divisions within the same city. It is common for different government agencies to release data for different divisions of a same city, country, or state.
Consequently,
$$\mathrm{E}[Y(A_i)] = \frac{1}{\lvert A_i \rvert} \int_{A_i} \mathrm{E}[Y(\mathbf{s})] \, \mathrm{d} \mathbf{s} = \frac{1}{\lvert A_i \rvert} \int_{A_i} \mu \, \mathrm{d} \mathbf{s} = \mu$$
and
where ∥s − s′∥ is the Euclidean distance between the coordinates s and s′, and C(∥s − s′∥; θ) is an isotropic covariance function depending on the parameter θ.
Assume we observe a random variable Y(⋅) at each region Ai and we are interested in predict/estimate this variable in each of the regions Bj. Now suppose the random variable Y(⋅) varies continuously over D and is defined as follows Y(s) = μ + S(s) + ε(s), s ∈ D ⊂ ℛ2.
where $$ S(\cdot) \sim \mathrm{GP}(0, \sigma^2 \rho(\cdot; \, \phi, \kappa)) \; \text{ and } \; \varepsilon(\cdot) \overset{\mathrm{i.i.d.}}{\sim} \mathrm{N}(0, \sigma^2 \rho(\cdot; \, \phi, \kappa)), $$ where S and ε are independent. For now, let’s make the unrealistic assumption that all those parameters are known. Then, our assumption is that the observed data is as follows
where |⋅| returns the area of a polygon. Furthermore, it can be shown that (using Fubini’s Theorem and some algebraic manipulation) $$ \mathrm{Cov}(Y(A_i), Y(A_j)) = \frac{\sigma^2}{\lvert A_i \rvert \lvert A_j \rvert} \int_{A_i \times A_j} \rho( \lVert \mathbf{s} - \mathbf{s}' \rVert; \, \phi, \kappa ) \, \mathrm{d} \mathbf{s} \, \mathrm{d} \mathbf{s}' + \mathbf{I}(i = j) \frac{\tau}{\lvert A_i \rvert}, $$ where ρ(⋅; ϕ, κ) is a positive definite correlation function. Now, let Rκ(ϕ) be a correlation matrix such that $$ \mathrm{R}_{\kappa}(\phi)_{ij} = \frac{1}{\lvert A_i \rvert \lvert A_j \rvert} \int_{A_i \times A_j} \rho( \lVert \mathbf{s} - \mathbf{s}' \rVert; \, \phi, \kappa ) \, \mathrm{d} \mathbf{s} \, \mathrm{d} \mathbf{s}', $$ thus, Y(A1, ⋯, An) ∼ N(μ1n, σ2Rκ(ϕ) + τdiag(|A1|−1, …, |A1|−1)). Then, if we assume (Y⊤(A1, ⋯, An), Y⊤(B1, ⋯, Am)⊤) to be jointly normal, we use can the conditional mean of Y⊤(B1, ⋯, Am)⊤ given Y⊤(A1, ⋯, An) to estimate the observed random variable in the partition B1, …, Bm.
Now, suppose the parameters θ = (μ, σ2, ϕ, τ) are unknown. The Likelihood of Y(A1, …, An) can still be computed.
In particular, if we use the parametrization α = τ/σ2, we have closed form for the Maximum Likelihood estimators both for μ and σ2. Thus, we can optimize the profile likelihood for ϕ and α numerically. Then, we resort on conditional Normal properties again to compute the predictions in a new different set of regions.
Areal interpolation is a nonparametric approach that interpolates Y(Ai)’s to construct Y(Bj)’s. Define an m × n matrix W = {wij}, where wij is the weight associated with the polygon Ai in constructing Y(Bj). The weights are wij = |Ai ∩ Bj|/|Bj| (Goodchild and Lam 1980; Gotway and Young 2002). The interpolation for Ŷ(B1, …, Bm) is constructed as The expectation and variance of the predictor are, respectively, E[Ŷ(B1, …, Bm)] = WE[Y(A1, …, An)] and In practice, the covariance matrix Var[Y(A1, …, An)] is unknown and, consequently needs to be estimated.
The variance each predictor Var[Ŷ(Bi)] is needed as an uncertainty measure. It relies on both the variances of Y(Aj)’s and their covariances: The variances are often observed in survey data, but the covariances are not. For practical purpose, we propose an approximation for Cov[Y(Ai), Y(Al)] based on Moran’s I, a global spatial autocorrelation. Specifically, let ρI be the Moran’s I calculated with a weight matrix constructed with first-degree neighbors. That is, ρI is the average of the pairwise correlation for all neighboring pairs. For regions Ai and Al, if they are neighbors of each other, our approximation is The covariance between non-neighboring Ai and Al is discarded. The final uncertainty approximation for Var[Ŷ(Bi)] will be an underestimate. Alternatively, we can derive, at least, an upper bound for the variance of the estimates by using a simple application from the Cauchy–Schwartz inequality, in which case, ρI is replaced with~1.