Exercises

Week 1

  1. Produce a plot of the Rongelap data in which a continuous colour scale or grey scale is used to indicate the value of the emission count per unit time at each location, and the two sub-areas with the 5 by 5 sub-grids at 50 metre spacing are shown as insets.
  2. Construct a polygonal approximation to the boundary of The Gambia. Construct plots of the malaria data which show the spatial variation in the values of the observed prevalence in each village and of the greenness covariate.
  3. (1) Consider the elevation data as a simple regression problem with elevation as the response and north-south location as the explanatory variable. Fit the standard linear regression model using ordinary least squares. Examine the residuals from the linear model, with a view to deciding whether any more sophisticated treatment of the spatial variation in elevation might be necessary.
  4. (2) Find a geostatistical data-set which interests you.
    1. What scientific questions are the data intended to address? Do these concern estimation, prediction, or testing?
    2. Identify the study region, the design, the response and the covariates, if any.
    3. What is the support of each response?
    4. What is the underlying signal?
    5. If you wished to predict the signal throughout the study region, would you choose to interpolate the response data?
  5. Load the Paraná data-set from geoR using the command
    data(parana)
    and inspect its documentation using
    help(parana)
    For these data, consider the same questions as were raised in Exercise 1.4.
  6. Read Chapter 2 of Diggle & Ribeiro (2007) (you can get this chapter here).
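
For the regression exercise on the elevation data, a minimal sketch in R might look as follows. It assumes that geoR supplies the elevation data as a geodata object with components coords and data, and that the second coordinate column is the north-south location. When geoR is not installed, a synthetic stand-in with purely hypothetical values is used so that the code still runs.

```r
# Use the geoR elevation data when available; otherwise fall back to a
# synthetic stand-in (hypothetical values, only so that the sketch runs).
if (requireNamespace("geoR", quietly = TRUE)) {
  data("elevation", package = "geoR")
  east  <- elevation$coords[, 1]
  north <- elevation$coords[, 2]   # assumed to be the north-south coordinate
  elev  <- elevation$data
} else {
  set.seed(1)
  east  <- runif(52, 0, 6)
  north <- runif(52, 0, 6)
  elev  <- 900 - 20 * north + 10 * east + rnorm(52, sd = 40)
}

# Ordinary least squares fit with north-south location as explanatory variable
fit <- lm(elev ~ north)
res <- residuals(fit)
print(summary(fit)$coefficients)

# Residuals against each coordinate: remaining structure in either panel
# suggests that a richer treatment of the spatial variation may be needed
op <- par(mfrow = c(1, 2))
plot(north, res, xlab = "north-south", ylab = "residual"); abline(h = 0, lty = 2)
plot(east,  res, xlab = "east-west",   ylab = "residual"); abline(h = 0, lty = 2)
par(op)
```

The residual plot against the east-west coordinate is included because the fitted model ignores that coordinate entirely, so any trend left in it shows up there first.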

Week 2

  1. (3) Load the data sets parana, Ksat and ca20 available in geoR using commands such as:
    data(parana)
    and read the documentation describing each data set with the help() function
    help(parana)
    Perform exploratory data analysis and build a model you find suitable for each data set.
  2. (3) In the examples above, would you have other candidate models for each data-set?
  3. Inspect an example geostatistical analysis for the hydraulic conductivity data.
  4. (4) Consider the following two models for a set of responses, Y_i : i=1, ... ,n associated with a sequence of positions x_i: i=1,...,n along a one-dimensional spatial axis x.
    1. Y_{i} = alpha + beta x_{i} + Z_{i}, where alpha and beta are parameters and the Z_{i} are mutually independent with mean zero and variance sigma^2_{Z}.
    2. Y_i = A + B x_i + Z_i where the Z_i are as in the first model but A and B are now random variables, independent of each other and of the Z_i, each with mean zero and respective variances $\sigma^2_A$ and $\sigma^2_B$.
      For each of these models, find the mean and variance of Y_i and the covariance between Y_i and Y_j for any j != i. Given a single realisation of either model, would it be possible to distinguish between them?
  5. (5) Suppose that $Y=(Y_1,\ldots,Y_n)$ follows a multivariate Gaussian distribution with $E[Y_i]=\mu$ and $Var(Y_i)=\sigma^2$, and that the covariance matrix of Y can be expressed as $V=\sigma^2 R(\phi)$. Write down the log-likelihood function for $\theta=(\mu,\sigma^2,\phi)$ based on a single realisation of Y and obtain explicit expressions for the maximum likelihood estimators of mu and sigma^2 when phi is known. Discuss how you would use these expressions to find maximum likelihood estimators numerically when phi is unknown.
  6. (6) Is the following a legitimate correlation function for a one-dimensional spatial process $S(x)$, $x \in \mathbb{R}$? Give either a proof or a counter-example.

$$\rho(u) = \begin{cases} 1-u & : 0 \le u \le 1 \\ 0 & : u > 1 \end{cases}$$

  7. (7) Consider the following method of simulating a realisation of a one-dimensional spatial process $S(x)$, $x \in \mathbb{R}$, with mean zero, variance 1 and correlation function rho(u). Choose a set of points $x_i \in \mathbb{R}$, $i=1,\ldots,n$. Let R denote the correlation matrix of $S=\{S(x_1),\ldots,S(x_n)\}$. Obtain the singular value decomposition of R as $R = D \Lambda D'$, where Lambda is a diagonal matrix whose non-zero entries are the eigenvalues of R, in order from largest to smallest. Let $Y=\{Y_1,\ldots,Y_n\}$ be an independent random sample from the standard Gaussian distribution, $N(0,1)$. Then the simulated realisation is $S = D \Lambda^{1/2} Y$.
  8. (7) Write an R function to simulate realisations using the above method for any specified set of points $x_i$ and a range of correlation functions of your choice. Use your function to simulate a realisation of S on (a discrete approximation to) the unit interval (0,1).
  9. (7) Now investigate how the appearance of your realisation S changes if, in the equation above, you replace the diagonal matrix Lambda by a truncated form in which the last k eigenvalues are replaced by zeros.
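
The simulation method of exercise (7) can be sketched in base R as below. This is one possible implementation and not a reference solution. It assumes that the realisation is formed as D Lambda^{1/2} Y, with D the matrix of eigenvectors of R. The exponential correlation function, its parameter phi = 0.2, the grid of 101 points and the function names are all illustrative choices.

```r
# Exponential correlation function; phi is a hypothetical range parameter
rho_exp <- function(u, phi = 0.2) exp(-u / phi)

# Simulate S = D Lambda^{1/2} Y at the points x.
# k is the number of trailing (smallest) eigenvalues replaced by zero.
sim_eigen <- function(x, rho = rho_exp, k = 0, ...) {
  n <- length(x)
  R <- rho(abs(outer(x, x, "-")), ...)   # correlation matrix of S(x_1), ..., S(x_n)
  e <- eigen(R, symmetric = TRUE)        # eigenvalues come in decreasing order
  lambda <- pmax(e$values, 0)            # guard against tiny negative round-off
  if (k > 0) lambda[(n - k + 1):n] <- 0  # truncated form of Lambda
  y <- rnorm(n)                          # independent N(0, 1) sample
  drop(e$vectors %*% (sqrt(lambda) * y))
}

x <- seq(0, 1, length.out = 101)         # discrete approximation to (0, 1)

set.seed(1)
s_full <- sim_eigen(x)
set.seed(1)                              # same Y, so only the truncation differs
s_trunc <- sim_eigen(x, k = 80)

plot(x, s_full, type = "l", ylab = "S(x)")
lines(x, s_trunc, lty = 2)
```

Resetting the seed before the second call keeps the Gaussian sample fixed, so any difference between the two curves is due to the truncation alone.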

Week 3

  1. (8) Fit a model to the surface elevation data assuming a linear trend model on the coordinates and a Matérn correlation function with parameter kappa=2.5. Use the fitted model as the true model and perform a simulation study (i.e. simulate from this model) to compare parameter estimation based on maximum likelihood, restricted maximum likelihood and variograms.
  2. (9) Simulate 200 points in the unit square from the Gaussian model without measurement error, with constant mean equal to zero, unit variance and exponential correlation function with phi=0.25 and anisotropy parameters (psi_A=pi/3, psi_R=2). Obtain parameter estimates (using maximum likelihood):
    • assuming an isotropic model
    • estimating the anisotropy parameters as well
      Compare the results and repeat the exercise for psi_R=4.
  3. (10) Consider a stationary trans-Gaussian model with known transformation function $h(\cdot)$, let $x$ be an arbitrary location within the study region and define T=h^{-1}(S(x)). Find explicit expressions for $P(T > c \mid Y)$, where Y=(Y_1,...,Y_n) denotes the observed measurements on the untransformed scale and:
    • h(u) = u
    • h(u) = log(u)
    • h(u) = sqrt(u).
  4. (11) Analyse the Paraná data-set or any other data set of your choice, assuming priors for the model parameters and obtaining:
    • a map of the predicted values over the area
    • a map of the predicted standard errors over the area
    • a map of the probabilities of being above a certain (arbitrarily) chosen threshold over the area
    • a map of the 10th, 25th, 50th, 75th and 90th percentiles over the area
    • the predictive distribution of the proportion of the area with the value of the study variable below a certain threshold (as a suggestion, you can use the 30th percentile of the data as the value of such a threshold).
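
For the exceedance-probability and percentile maps, the pointwise calculation is simple once a predictive mean and variance are available at each prediction location and the predictive distribution is taken to be Gaussian. The sketch below uses made-up means and variances at four locations. In practice these would come from a kriging step; in geoR, for instance, krige.conv() returns components predict and krige.var.

```r
# Hypothetical predictive means and variances at four prediction locations
pred_mean <- c(210, 250, 305, 280)
pred_var  <- c(400, 625, 900, 225)
pred_sd   <- sqrt(pred_var)

# Probability of exceeding an arbitrarily chosen threshold at each location
threshold <- 270
p_exceed <- 1 - pnorm(threshold, mean = pred_mean, sd = pred_sd)

# Pointwise percentiles of the Gaussian predictive distribution
pct <- c(10, 25, 50, 75, 90)
pred_quant <- sapply(pct / 100, function(p) qnorm(p, pred_mean, pred_sd))
colnames(pred_quant) <- paste0("q", pct)

print(round(p_exceed, 3))
print(round(pred_quant, 1))
```

Each column of pred_quant would give one percentile map when the rows are arranged on the prediction grid. The proportion of the area below a threshold is a different kind of target: it depends on the joint predictive distribution, so it calls for simulation and not for these pointwise formulae.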

Week 4

  1. (12) Consider the stationary Gaussian model in which Y_i = beta + S(x_i) + Z_i : i=1,...,n, where S(x) is a stationary Gaussian process with mean zero, variance sigma^2 and correlation function rho(u), whilst the Z_i are mutually independent $N(0,\tau^2)$ random variables. Assume that all parameters except beta are known. Derive the Bayesian predictive distribution of S(x) for an arbitrary location x when beta is assigned an improper uniform prior, pi(beta) constant for all real beta. Compare the result with the ordinary kriging formulae.
  2. (13) For the model assumed in the previous exercise, assuming a correlation function parametrised by a scalar parameter phi obtain the posterior distribution for:
    • a normal prior for beta and assuming the remaining parameters are known
    • a normal-scaled-inverse-chi^2 prior for (beta, sigma^2) and assuming the correlation parameter is known
    • a normal-scaled-inverse-chi^2 prior for (beta, sigma^2|phi) and assuming a generic prior p(phi) for correlation parameter.
  3. (14) Analyse the Paraná data-set or any other data set of your choice assuming priors for the model parameters and obtaining:
    • the posterior distribution for the model parameters
    • a map of the predictive mean over the area
    • a map of the predictive median over the area
    • the predictive distribution at three arbitrary selected locations within the area
  4. (15) Obtain simulations from the Poisson model as shown in Figure 4.1 of the textbook for the course.
  5. (15) Try to reproduce or mimic the results shown in Figure 4.2 of the textbook for the course by simulating a data set and carrying out a similar data analysis. Note: for the example in the book we have used set.seed(34).
  6. (16) Reproduce the simulated binomial data shown in Figure 4.6. Use the package geoRglm in conjunction with priors of your choice to obtain predictive distributions for the signal $S(x)$ at locations Graph and Graph. Compare the predictive inferences which you obtained in the previous exercise with those obtained by fitting a linear Gaussian model to the empirical logit transformed data, log{(y+0.5)/(n-y+0.5)}. Compare the results of the two previous analyses and comment generally.
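
As a starting point for the Poisson and binomial exercises, the following base R sketch simulates a Gaussian signal at random locations in the unit square and then draws counts conditional on it. It does not reproduce the configuration of the book's figures. The sample size, parameter values, number of trials and link functions (log for the Poisson counts, logit for the binomial counts) are assumptions made for illustration only.

```r
set.seed(1)

# Hypothetical design and parameter values
n_loc  <- 100
coords <- cbind(runif(n_loc), runif(n_loc))
beta   <- 1      # constant mean on the linear predictor scale
sigma2 <- 1      # signal variance
phi    <- 0.2    # range parameter of the exponential correlation function

# Gaussian signal S with covariance sigma2 * exp(-d / phi)
D     <- as.matrix(dist(coords))
Sigma <- sigma2 * exp(-D / phi)
U     <- chol(Sigma)                      # upper triangular, t(U) %*% U = Sigma
S     <- drop(crossprod(U, rnorm(n_loc))) # t(U) %*% z has covariance Sigma

# Poisson counts with log link
y_pois <- rpois(n_loc, lambda = exp(beta + S))

# Binomial counts with logit link and a fixed number of trials per location
trials <- 4
y_bin  <- rbinom(n_loc, size = trials, prob = plogis(S))

# Empirical logit transform, as given in the exercise
emp_logit <- function(y, n) log((y + 0.5) / (n - y + 0.5))
z <- emp_logit(y_bin, trials)
```

The transformed values z stay finite even when y_bin equals 0 or trials, which is the purpose of the 0.5 offsets, and they could then be analysed with a linear Gaussian model for comparison.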

Week 5

  1. (17) The composite likelihood (CL) is obtained as the product of independent distributions for pairs of variables at data locations. Assume a Gaussian model with constant mean and isotropic exponential correlation function.
    • write down the expression of the CL and discuss how parameter estimates could be obtained
    • write code to obtain CL parameter estimates for the s100 data set and compare them with the ones given by ML and REML.
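
One way to set up the pairwise composite likelihood numerically is sketched below in base R. It runs on simulated stand-in data and not on s100. It assumes a model without nugget and maximises over (mu, log sigma^2, log phi) with the default Nelder-Mead method of optim(). The starting values are arbitrary, and the function and variable names are illustrative.

```r
set.seed(1)

# Stand-in data: Gaussian process with mean 2, unit variance, phi = 0.2
n      <- 60
coords <- cbind(runif(n), runif(n))
D      <- as.matrix(dist(coords))
y      <- 2 + drop(crossprod(chol(exp(-D / 0.2)), rnorm(n)))

# All distinct pairs (i, j) with i < j, and their separation distances
pairs_ij <- which(upper.tri(D), arr.ind = TRUE)
i_idx <- pairs_ij[, 1]
j_idx <- pairs_ij[, 2]
d_ij  <- D[pairs_ij]

# Negative log composite likelihood: sum over pairs of the negative
# bivariate Gaussian log-density with correlation exp(-d / phi)
neg_log_cl <- function(par) {
  mu <- par[1]; sigma2 <- exp(par[2]); phi <- exp(par[3])
  r <- exp(-d_ij / phi)
  a <- y[i_idx] - mu
  b <- y[j_idx] - mu
  q <- (a^2 - 2 * r * a * b + b^2) / (sigma2 * (1 - r^2))
  sum(log(2 * pi) + log(sigma2) + 0.5 * log(1 - r^2) + 0.5 * q)
}

start <- c(mean(y), log(var(y)), log(0.1))
fit   <- optim(start, neg_log_cl)
est   <- c(mu = fit$par[1], sigma2 = exp(fit$par[2]), phi = exp(fit$par[3]))
print(round(est, 3))
```

Working on the log scale for sigma^2 and phi keeps both parameters positive without constrained optimisation. For the exercise itself, the same function could be applied to the coordinates and data of s100 and the estimates set beside those from ML and REML.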
