---
title: "4. Method"
output: rmarkdown::html_vignette
bibliography: biblio.bib
vignette: >
  %\VignetteIndexEntry{4. Method}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

## General setup

Let us set up the notation first. Suppose there exists a partition of a region $\mathrm{D} \subset \mathcal{R}^2$ (e.g., a city). This partition is denoted by $A_i$, $i = 1, \ldots, n$. Moreover, there exists another partition of the same city, denoted $B_j$, where $j = 1, \ldots, m$. These partitions can be seen as two different administrative divisions within the same city. It is common for different government agencies to release data for different divisions of the same city, country, or state.

## Model-based approach

Assume we observe a random variable $Y(\cdot)$ at each region $A_i$ and we are interested in predicting/estimating this variable in each of the regions $B_j$. Now suppose the random variable $Y(\cdot)$ varies continuously over $\mathrm{D}$ and is defined as follows:
$$Y(\mathbf{s}) = \mu + S(\mathbf{s}) + \varepsilon(\mathbf{s}), \; \mathbf{s} \in \mathrm{D} \subset \mathcal{R}^2,$$
where
$$ S(\cdot) \sim \mathrm{GP}(0, \sigma^2 \rho(\cdot; \, \phi, \kappa)) \; \text{ and } \; \varepsilon(\cdot) \overset{\mathrm{i.i.d.}}{\sim} \mathrm{N}(0, \tau), $$
and $S$ and $\varepsilon$ are independent. For now, let's make the unrealistic assumption that all these parameters are known. Then, our assumption is that the observed data are generated as
\begin{align*}
Y(A_i) & = \frac{1}{\lvert A_i \rvert} \int_{A_i} Y(\mathbf{s}) \, \mathrm{d} \mathbf{s} \\
 & = \frac{1}{\lvert A_i \rvert} \int_{A_i} [\mu + S(\mathbf{s}) + \varepsilon(\mathbf{s})] \, \mathrm{d} \mathbf{s} \\
 & = \mu + \frac{1}{\lvert A_i \rvert} \int_{A_i} S(\mathbf{s}) \, \mathrm{d} \mathbf{s} + \frac{1}{\lvert A_i \rvert} \int_{A_i} \varepsilon(\mathbf{s}) \, \mathrm{d} \mathbf{s},
\end{align*}
where $\lvert \cdot \rvert$ returns the area of a polygon.

Consequently,
$$\mathrm{E}[Y(A_i)] = \frac{1}{\lvert A_i \rvert} \int_{A_i} \mathrm{E}[Y(\mathbf{s})] \, \mathrm{d} \mathbf{s} = \frac{1}{\lvert A_i \rvert} \int_{A_i} \mu \, \mathrm{d} \mathbf{s} = \mu$$
and
\begin{align*}
\mathrm{Cov}[Y(A_i), Y(A_j)] & = \frac{1}{\lvert A_i \rvert \lvert A_j \rvert} \int_{A_i \times A_j} \mathrm{Cov}[Y(\mathbf{s}), Y(\mathbf{s}')] \, \mathrm{d} \mathbf{s} \, \mathrm{d} \mathbf{s'} \\
 & = \frac{1}{\lvert A_i \rvert \lvert A_j \rvert} \int_{A_i \times A_j} \mathrm{C}( \lVert \mathbf{s} - \mathbf{s}' \rVert ; \boldsymbol{\theta}) \, \mathrm{d} \mathbf{s} \, \mathrm{d} \mathbf{s'},
\end{align*}
where $\lVert \mathbf{s} - \mathbf{s}' \rVert$ is the Euclidean distance between the coordinates $\mathbf{s}$ and $\mathbf{s}'$, and $\mathrm{C}( \lVert \mathbf{s} - \mathbf{s}' \rVert ; \boldsymbol{\theta})$ is an isotropic covariance function depending on the parameter $\boldsymbol{\theta}$. Furthermore, it can be shown (using Fubini's Theorem and some algebraic manipulation) that, under the model above,
$$ \mathrm{Cov}(Y(A_i), Y(A_j)) = \frac{\sigma^2}{\lvert A_i \rvert \lvert A_j \rvert} \int_{A_i \times A_j} \rho( \lVert \mathbf{s} - \mathbf{s}' \rVert; \, \phi, \kappa ) \, \mathrm{d} \mathbf{s} \, \mathrm{d} \mathbf{s}' + \mathbf{I}(i = j) \frac{\tau}{\lvert A_i \rvert}, $$
where $\rho(\cdot ; \, \phi, \kappa)$ is a positive definite correlation function.
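These double integrals rarely have closed forms and are usually approximated numerically. The chunk below is a minimal sketch of one such approximation by Monte Carlo, assuming (for illustration only) an exponential correlation function $\rho(d; \, \phi) = \exp(-d / \phi)$, two adjacent unit squares as regions, and made-up values for $\sigma^2$, $\phi$, and $\tau$; it is not the routine used by the package.

```{r, eval = FALSE}
## Monte Carlo approximation of
##   Cov[Y(A_i), Y(A_j)] = sigma^2 / (|A_i||A_j|) int int rho(||s - s'||) ds ds'
##                         + I(i = j) * tau / |A_i|
## using uniform draws over each region (illustrative parameter values).
set.seed(2024)
sigma2 <- 1.0 # partial sill
phi    <- 0.5 # range of the exponential correlation
tau    <- 0.1 # nugget variance

rho_exp <- function(d, phi) exp(-d / phi)

## A_1 = [0, 1] x [0, 1] and A_2 = [1, 2] x [0, 1], both with unit area
n_mc <- 1000
s1 <- cbind(runif(n_mc, 0, 1), runif(n_mc, 0, 1))
s2 <- cbind(runif(n_mc, 1, 2), runif(n_mc, 0, 1))

## the average pairwise correlation approximates the normalized double integral
block_cor <- function(sa, sb, phi) {
  d2 <- outer(sa[, 1], sb[, 1], "-")^2 + outer(sa[, 2], sb[, 2], "-")^2
  mean(rho_exp(sqrt(d2), phi))
}

cov_11 <- sigma2 * block_cor(s1, s1, phi) + tau / 1 # i = j: add tau / |A_1|
cov_12 <- sigma2 * block_cor(s1, s2, phi)           # i != j: no nugget term
```

The same averaging, applied to every pair of regions, fills in the correlation matrix defined next.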
Now, let $\mathrm{R}_{\kappa}(\phi)$ be a correlation matrix such that
$$ \mathrm{R}_{\kappa}(\phi)_{ij} = \frac{1}{\lvert A_i \rvert \lvert A_j \rvert} \int_{A_i \times A_j} \rho( \lVert \mathbf{s} - \mathbf{s}' \rVert; \, \phi, \kappa ) \, \mathrm{d} \mathbf{s} \, \mathrm{d} \mathbf{s}'. $$
Thus,
$$ Y(A_1, \ldots, A_n) \sim \mathrm{N}( \mu \mathbf{1}_n, \sigma^2 \mathrm{R}_{\kappa}(\phi) + \tau \, \mathrm{diag}(\lvert A_1 \rvert^{-1}, \ldots, \lvert A_n \rvert^{-1})). $$
Then, if we assume $(Y^{\top}(A_1, \ldots, A_n), Y^{\top}(B_1, \ldots, B_m))^{\top}$ to be jointly normal, we can use the conditional mean of $Y(B_1, \ldots, B_m)$ given $Y(A_1, \ldots, A_n)$ to estimate the observed random variable on the partition $B_1, \ldots, B_m$.

---

Now, suppose the parameters $\boldsymbol{\theta} = (\mu, \sigma^2, \phi, \tau)$ are unknown. The likelihood of $Y(A_1, \ldots, A_n)$ can still be computed. In particular, if we use the parametrization $\alpha = \tau / \sigma^2$, we have closed-form expressions for the maximum likelihood estimators of both $\mu$ and $\sigma^2$. Thus, we can optimize the profile likelihood of $\phi$ and $\alpha$ numerically. Then, we resort to the properties of the conditional Normal distribution again to compute predictions on a new, different set of regions.

## Areal Interpolation (AI)

Areal interpolation is a nonparametric approach that interpolates the $Y(A_i)$'s to construct the $Y(B_j)$'s. Define an $m \times n$ matrix $\mathbf{W} = \{ w_{ij} \}$, where $w_{ij}$ is the weight associated with the polygon $A_j$ in constructing $Y(B_i)$. The weights are $w_{ij} = \lvert A_j \cap B_i \rvert / \lvert B_i \rvert$ [@goodchild1980areal; @gotway2002combining]. The interpolation for $\hat Y(B_1, \ldots, B_m)$ is constructed as
\begin{equation} \label{eq:np-est}
\hat{Y}(B_1, \ldots, B_m) = \mathbf{W} Y(A_1, \ldots, A_n).
\end{equation}
The expectation and variance of the predictor are, respectively,
\[ \mathrm{E}[\hat{Y}(B_1, \ldots, B_m)] = \mathbf{W} \mathrm{E}[Y(A_1, \ldots, A_n)] \]
and
\begin{equation} \label{eq:np-matcov}
\textrm{Var}[\hat{Y}(B_1, \ldots, B_m)] = \mathbf{W} \textrm{Var}[Y(A_1, \ldots, A_n)] \mathbf{W}^{\top}.
\end{equation}
In practice, the covariance matrix $\textrm{Var}[Y(A_1, \ldots, A_n)]$ is unknown and, consequently, needs to be estimated. The variance of each predictor, $\text{Var}[\hat Y(B_i)]$, is needed as an uncertainty measure. It relies on both the variances of the $Y(A_j)$'s and their covariances:
\begin{align} \label{eq:np-single-var}
\textrm{Var}[\hat{Y}(B_i)] = \sum_{j = 1}^n w^2_{ij} \textrm{Var} \left[ Y(A_j) \right] + 2 \sum_{j < l} w_{ij} w_{il} \textrm{Cov} \left[ Y(A_j), Y(A_l) \right].
\end{align}
The variances are often observed in survey data, but the covariances are not. For practical purposes, we propose an approximation for $\textrm{Cov}[ Y(A_j), Y(A_l)]$ based on Moran's I, a global measure of spatial autocorrelation. Specifically, let $\rho_I$ be the Moran's I calculated with a weight matrix constructed with first-degree neighbors. That is, $\rho_I$ is the average of the pairwise correlations over all neighboring pairs. For regions $A_j$ and $A_l$ that are neighbors of each other, our approximation is
\begin{align} \label{eq:cova}
\textrm{Cov} \left[ Y(A_j), Y(A_l) \right] = \rho_I \sqrt{\text{Var}[Y(A_j)] \text{Var}[Y(A_l)]}.
\end{align}
The covariance between non-neighboring $A_j$ and $A_l$ is set to zero. The final uncertainty approximation for $\textrm{Var}[\hat{Y}(B_i)]$ will therefore be an underestimate.
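As an illustration of the interpolation step, the sketch below builds $\mathbf{W}$ from polygon intersections and applies the Moran's I covariance approximation. It assumes the `sf` package is available for the geometry operations, and the observed values, variances, and $\rho_I$ are made up for the example; in practice, $\rho_I$ would be computed from the data.

```{r, eval = FALSE}
library(sf)

## two source regions A_1, A_2 and one target region B_1 straddling both
A <- st_sfc(
  st_polygon(list(rbind(c(0, 0), c(1, 0), c(1, 1), c(0, 1), c(0, 0)))),
  st_polygon(list(rbind(c(1, 0), c(2, 0), c(2, 1), c(1, 1), c(1, 0))))
)
B <- st_sfc(
  st_polygon(list(rbind(c(.5, 0), c(1.5, 0), c(1.5, 1), c(.5, 1), c(.5, 0))))
)

## m x n weight matrix: w_ij = |A_j intersect B_i| / |B_i|
W <- outer(
  seq_along(B), seq_along(A),
  Vectorize(function(i, j)
    st_area(st_intersection(B[[i]], A[[j]])) / st_area(B[[i]]))
)

## made-up observed values and (survey) variances for Y(A_1), Y(A_2)
y_A   <- c(3.0, 5.0)
var_A <- c(1.0, 1.2)
rho_I <- 0.4 # assumed Moran's I; estimate it from the data in practice

## point predictions: Y-hat(B) = W Y(A)
y_B_hat <- drop(W %*% y_A)

## A_1 and A_2 are neighbors, so their covariance is approximated by
## rho_I * sqrt(Var[Y(A_1)] * Var[Y(A_2)])
Sigma_A <- diag(var_A)
Sigma_A[1, 2] <- Sigma_A[2, 1] <- rho_I * sqrt(var_A[1] * var_A[2])

## Var[Y-hat(B)] = W Var[Y(A)] W^T
var_B_hat <- diag(W %*% Sigma_A %*% t(W))
```

Replacing `rho_I` with 1 in `Sigma_A` yields the Cauchy--Schwarz upper bound described next.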
Alternatively, we can derive, at least, an upper bound for the variance of the estimates through a simple application of the Cauchy--Schwarz inequality, in which case $\rho_I$ is replaced with 1.

# References