Differences
This shows the differences between two revisions of this page.
disciplinas:verao2007:exercicios [2007/02/17 22:48] paulojus |
disciplinas:verao2007:exercicios [2007/02/18 20:16] (current) paulojus |
||
---|---|---|---|
Line 18: | Line 18: | ||
- (3) Load the data sets ''parana'', ''Ksat'' and ''ca20'' available in ''geoR'' using commands such as: <code R>data(parana)</code> and read the documentation describing each data set with the ''help()'' function <code R>help(parana)</code> Perform exploratory data analysis and build a model you find suitable for each data set. | - (3) Load the data sets ''parana'', ''Ksat'' and ''ca20'' available in ''geoR'' using commands such as: <code R>data(parana)</code> and read the documentation describing each data set with the ''help()'' function <code R>help(parana)</code> Perform exploratory data analysis and build a model you find suitable for each data set. | ||
- | - (3) In the examples above, would you have othe //candidate// models for each data-set? | + | - (3) In the examples above, would you have other //candidate// models for each data-set? |
- Inspect [[http://leg.ufpr.br/geoR/tutorials/Rcruciani.R|an example geostatistical analysis]] for the hydraulic conductivity data. | - Inspect [[http://leg.ufpr.br/geoR/tutorials/Rcruciani.R|an example geostatistical analysis]] for the hydraulic conductivity data. | ||
- (4) Consider the following two models for a set of responses, <m>Y_i : i=1, ... ,n</m> associated with a sequence of positions <m>x_i: i=1,...,n</m> along a one-dimensional spatial axis <m>x</m>. | - (4) Consider the following two models for a set of responses, <m>Y_i : i=1, ... ,n</m> associated with a sequence of positions <m>x_i: i=1,...,n</m> along a one-dimensional spatial axis <m>x</m>. | ||
- <m>Y_{i} = alpha + beta x_{i} + Z_{i}</m>, where <m>alpha</m> and <m>beta</m> are parameters and the <m>Z_{i}</m> are mutually independent with mean zero and variance <m>sigma^2_{Z}</m>. | - <m>Y_{i} = alpha + beta x_{i} + Z_{i}</m>, where <m>alpha</m> and <m>beta</m> are parameters and the <m>Z_{i}</m> are mutually independent with mean zero and variance <m>sigma^2_{Z}</m>. | ||
- | - <m>Y_i = A + B x_i + Z_i</m> where the $Z_i$ are as in (a) but //A// and //B// are now random variables, independent of each other and of the $Z_i$, each with mean zero and respective variances $\sigma_A^2$ and $\sigma_B^2$.\\ For each of these models, find the mean and variance of $Y_i$ and the covariance between $Y_i$ and $Y_j$ for any $j \neq i$. Given a single realisation of either model, would it be possible to distinguish between them? | + | - <m>Y_i = A + B x_i + Z_i</m> where the <m>Z_i</m> are as in (a) but //A// and //B// are now random variables, independent of each other and of the <m>Z_i</m>, each with mean zero and respective variances <latex>$\sigma_A^2$</latex> and <latex>$\sigma_B^2$</latex>.\\ For each of these models, find the mean and variance of <m>Y_i</m> and the covariance between <m>Y_i</m> and <m>Y_j</m> for any <m>j != i</m>. Given a single realisation of either model, would it be possible to distinguish between them? |
- | - (5) Suppose that $Y=(Y_1,\ldots,Y_n)$ follows a multivariate Gaussian distribution with ${\rm E}[Y_i]=\mu$ and ${\rm Var}\{Y_i\}=\sigma^2$ and that the covariance matrix of $Y$ can be expressed as $V=\sigma^2 R(\phi)$. Write down the log-likelihood function for $\theta=(\mu,\sigma^2,\phi)$ based on a single realisation of $Y$ and obtain explicit expressions for the maximum likelihood estimators of $\mu$ and $\sigma^2$ when $\phi$ is known. Discuss how you would use these expressions to find maximum likelihood estimators numerically when $\phi$ is unknown. | + | - (5) Suppose that <latex>$Y=(Y_1,\ldots,Y_n)$</latex> follows a multivariate Gaussian distribution with <latex>${\rm E}[Y_i]=\mu$</latex> and <latex>${\rm Var}\{Y_i\}=\sigma^2$</latex> and that the covariance matrix of <m>Y</m> can be expressed as <m>V=sigma^2 R(phi)</m>. Write down the log-likelihood function for <latex>$\theta=(\mu,\sigma^2,\phi)$</latex> based on a single realisation of <m>Y</m> and obtain explicit expressions for the maximum likelihood estimators of <m>mu</m> and <m>sigma^2</m> when <m>phi</m> is known. Discuss how you would use these expressions to find maximum likelihood estimators numerically when <m>phi</m> is unknown. |
- | - (6) Is the following a legitimate correlation function for a one-dimensional spatial process $S(x) : x \in \IR$? Give either a proof or a counter-example.\\ | + | - (6) Is the following a legitimate correlation function for a one-dimensional spatial process <latex>$S(x) : x \in R$</latex>? Give either a proof or a counter-example.\\ |
<m> rho(u) = delim{lbrace}{matrix{2}{1}{{1-u : 0 <= u <= 1}{0 : u>1}}}{} </m>\\ | <m> rho(u) = delim{lbrace}{matrix{2}{1}{{1-u : 0 <= u <= 1}{0 : u>1}}}{} </m>\\ | ||
- | - (7) Consider the following method of simulating a realisation of a one-dimensional spatial process on $S(x) : x \in \IR$, with mean zero, variance 1 and correlation function $\rho(u)$. Choose a set of points $x_i \in \IR : i=1,\ldots,n$. Let $R$ denote the correlation matrix of $S=\{S(x_1),\ldots,S(x_n)\}$. Obtain the singular value decomposition of $R$ as $R = D \Lambda D^\prime$ where $\lambda$ is a diagonal matrix whose non-zero entries are the eigenvalues of $R$, in order from largest to smallest. Let $Y=\{Y_1,\ldots,Y_n\}$ be an independent random sample from the standard Gaussian distribution, ${\rm N}(0,1)$. Then the simulated realisation is <latex>$S = D \Lambda^{\frac{1}{2}} Y$</latex> | + | - (7) Consider the following method of simulating a realisation of a one-dimensional spatial process on <latex>$S(x) : x \in R$</latex>, with mean zero, variance 1 and correlation function <m>rho(u)</m>. Choose a set of points <latex>$x_i \in R : i=1,\ldots,n$</latex>. Let <m>R</m> denote the correlation matrix of <latex>$S=\{S(x_1),\ldots,S(x_n)\}$</latex>. Obtain the singular value decomposition of <m>R</m> as <latex>$R = D \Lambda D^\prime$</latex> where <m>Lambda</m> is a diagonal matrix whose non-zero entries are the eigenvalues of <m>R</m>, in order from largest to smallest. Let <latex>$Y=\{Y_1,\ldots,Y_n\}$</latex> be an independent random sample from the standard Gaussian distribution, <latex>${\rm N}(0,1)$</latex>. Then the simulated realisation is <latex>$S = D \Lambda^{\frac{1}{2}} Y$</latex> |
- | - (7) Write an ''R'' function to simulate realisations using the above method for any specified set of points $x_i$ and a range of correlation functions of your choice. Use your function to simulate a realisation of $S$ on (a discrete approximation to) the unit interval $(0,1)$. | + | - (7) Write an ''R'' function to simulate realisations using the above method for any specified set of points <m>x_i</m> and a range of correlation functions of your choice. Use your function to simulate a realisation of <m>S</m> on (a discrete approximation to) the unit interval <m>(0,1)</m>. |
- | - (7) Now investigate how the appearance of your realisation $S$ changes if in the equation above you replace the diagonal matrix $\Lambda$ by truncated form in which you replace the last $k$ eigenvalues by zeros. | + | - (7) Now investigate how the appearance of your realisation <m>S</m> changes if in the equation above you replace the diagonal matrix <m>Lambda</m> by truncated form in which you replace the last <m>k</m> eigenvalues by zeros. |
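The simulation method described in exercise (7), including the truncation step, can be sketched in base R as follows. This is a minimal sketch under stated assumptions, not a model answer: the function name `sim_spectral`, its arguments, the exponential correlation function and the value `phi = 0.2` are illustrative choices that do not come from the exercise sheet. For a symmetric correlation matrix the decomposition is computed here with `eigen()`, which returns the eigenvalues in decreasing order.

```r
# Simulate S = D Lambda^{1/2} Y at the points x for a correlation function rho.
# k is the number of smallest eigenvalues replaced by zero (hypothetical argument).
sim_spectral <- function(x, rho, k = 0) {
  n <- length(x)
  R <- rho(abs(outer(x, x, "-")))          # correlation matrix of S(x_1), ..., S(x_n)
  e <- eigen(R, symmetric = TRUE)          # eigenvalues sorted from largest to smallest
  lambda <- pmax(e$values, 0)              # guard against tiny negative rounding errors
  if (k > 0) lambda[(n - k + 1):n] <- 0    # truncation: zero the last k eigenvalues
  y <- rnorm(n)                            # independent N(0, 1) sample
  drop(e$vectors %*% (sqrt(lambda) * y))   # D Lambda^{1/2} Y
}

# Exponential correlation function; phi = 0.2 is an assumed value.
rho_exp <- function(u, phi = 0.2) exp(-u / phi)

x <- seq(0, 1, length.out = 101)           # discrete approximation to (0, 1)
set.seed(1)
s_full <- sim_spectral(x, rho_exp)
set.seed(1)                                # same Y, so only the truncation differs
s_trunc <- sim_spectral(x, rho_exp, k = 80)
```

Plotting both vectors against `x`, for example with `plot(x, s_full, type = "l")` followed by `lines(x, s_trunc, lty = 2)`, gives one way to compare the full and truncated realisations.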
==== Week 3 ==== | ==== Week 3 ==== | ||
- | - (8) Fit a model to the surface elevation data assuming a linear trend model on the coordinates and a Matérn correlation function with parameter kappa=2.5. Use the fitted model as the true model and perform a simulation study (i.e. simulate from this model) to compare parameter estimation based on maximum likelihood, restricted maximum likelihood and variograms. | + | - (8) Fit a model to the surface elevation data assuming a linear trend model on the coordinates and a Matérn correlation function with parameter <m>kappa=2.5</m>. Use the fitted model as the true model and perform a simulation study (i.e. simulate from this model) to compare parameter estimation based on maximum likelihood, restricted maximum likelihood and variograms. |
- | - (9) Simulate 200 points in the unit square from the Gaussian model without measurement error, constant mean equal to zero, unit variance and exponential correlation function with $\phi=0.25$ and anisotropy parameters $(\psi_A=\pi/3, \psi_R=2)$. Obtain parameter estimates (using maximum likelihood): | + | - (9) Simulate 200 points in the unit square from the Gaussian model without measurement error, constant mean equal to zero, unit variance and exponential correlation function with <m>phi=0.25</m> and anisotropy parameters <m>(psi_A=pi/3, psi_R=2)</m>. Obtain parameter estimates (using maximum likelihood): |
* assuming an isotropic model | * assuming an isotropic model | ||
- | * try to estimate the anisotropy parameters \\ Compare the results and repeat the exercise for $\psi_R=4$. | + | * try to estimate the anisotropy parameters \\ Compare the results and repeat the exercise for <m>psi_R=4</m>. |
- | - (10) Consider a stationary trans-Gaussian model with known transformation function $h(\cdot)$, let $x$ be an arbitrary | + | - (10) Consider a stationary trans-Gaussian model with known transformation function <latex>$h(\cdot)$</latex>, let <m>x</m> be an arbitrary |
- | location within the study region and define <m>T=h^{-1}{S(x)}</m>. Find explicit expressions for ${\rm P}(T>c|Y)$ where | + | location within the study region and define <m>T=h^{-1}(S(x))</m>. Find explicit expressions for <latex>${\rm P}(T>c|Y)$</latex> where <m>Y=(Y_1,...,Y_n)</m> denotes the observed measurements on the untransformed scale and: |
- | $Y=(Y_1,...,Y_n)$ denotes the observed measurements on the untransformed scale and: | + | |
* <m>h(u)=u</m> | * <m>h(u)=u</m> | ||
- | * <m>h(u) = \log u</m> | + | * <m>h(u) = log{u}</m> |
* <m>h(u) = sqrt{u}</m>. | * <m>h(u) = sqrt{u}</m>. | ||
- (11) Analyse the Paraná data-set or any other data set of your choice assuming priors obtaining: | - (11) Analyse the Paraná data-set or any other data set of your choice assuming priors obtaining: | ||
* a map of the predicted values over the area | * a map of the predicted values over the area | ||
* a map of the predicted standard errors over the area | * a map of the predicted standard errors over the area | ||
- | * a map of the probabilities of being above a certain (arbitrarily) choosen threshold over the area | + | * a map of the probabilities of being above a certain (arbitrarily) chosen threshold over the area |
* a map of the 10th, 25th, 50th, 75th and 90th percentiles over the area | * a map of the 10th, 25th, 50th, 75th and 90th percentiles over the area | ||
- | * the predictive distribution of the porportion of the area with the value of the study variable below a certain threshold. (as a suggestion you can use the 30th percentile of the data as the value of such a threshold) | + | * the predictive distribution of the proportion of the area with the value of the study variable below a certain threshold. (as a suggestion you can use the 30th percentile of the data as the value of such a threshold) |
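For exercise (10), if the conditional distribution of S(x) given Y on the transformed scale is Gaussian with mean m and variance v, the exceedance probability reduces to Gaussian tail probabilities. The sketch below assumes that m and v are already available, for instance from kriging on the transformed scale; the function name `prob_exceed` and its arguments are hypothetical. For the square-root case it takes T = S(x)^2 literally, so both tails of the Gaussian contribute; that reading is an assumption and worth checking against your own derivation.

```r
# P(T > thr | Y) with T = h^{-1}(S(x)) and S(x) | Y ~ N(m, v).
# m, v: predictive mean and variance on the transformed scale (assumed inputs).
prob_exceed <- function(thr, m, v, h = c("identity", "log", "sqrt")) {
  h <- match.arg(h)
  s <- sqrt(v)
  switch(h,
    # h(u) = u: T = S(x)
    identity = pnorm(thr, m, s, lower.tail = FALSE),
    # h(u) = log u: T = exp(S(x)), so T > thr iff S(x) > log(thr), thr > 0
    log = pnorm(log(thr), m, s, lower.tail = FALSE),
    # h(u) = sqrt(u): T = S(x)^2, so T > thr iff |S(x)| > sqrt(thr), thr >= 0
    sqrt = pnorm(sqrt(thr), m, s, lower.tail = FALSE) + pnorm(-sqrt(thr), m, s)
  )
}

# Illustrative values only
p_id  <- prob_exceed(0, m = 0, v = 1, h = "identity")
p_log <- prob_exceed(1, m = 0, v = 1, h = "log")
```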
==== Week 4 ==== | ==== Week 4 ==== | ||
- | - (12) Consider the stationary Gaussian model in which $Y_i = \beta + S(x_i) + Z_i :i=1,\ldots,n$, where $S(x)$ is a stationary Gaussian process with mean zero, variance $\sigma^2$ and correlation function $\rho(u)$, whilst the $Z_i$ are mutually independent ${\rm N}(0,\tau^2)$ random variables. Assume that all parameters except $\beta$ are known. Derive the Bayesian predictive distribution of $S(x)$ for an arbitrary location $x$ when $\beta$ is assigned an improper uniform prior, $\pi(\beta)$ constant for all real $\beta$. Compare the result with the ordinary kriging formulae. | + | - (12) Consider the stationary Gaussian model in which <m>Y_i = beta + S(x_i) + Z_i :i=1,...,n</m>, where <m>S(x)</m> is a stationary Gaussian process with mean zero, variance <m>sigma^2</m> and correlation function <m>rho(u)</m>, whilst the <m>Z_i</m> are mutually independent <latex>${\rm N}(0,\tau^2)$</latex> random variables. Assume that all parameters except <m>beta</m> are known. Derive the Bayesian predictive distribution of <m>S(x)</m> for an arbitrary location <m>x</m> when <m>beta</m> is assigned an improper uniform prior, <m>pi(beta)</m> constant for all real <m>beta</m>. Compare the result with the ordinary kriging formulae. |
- | - (13) For the model assumed in the previous exercise, assuming a correlation function parametrised by a scalar parameter $\phi$ obtain the posterior distribution for: | + | - (13) For the model assumed in the previous exercise, assuming a correlation function parametrised by a scalar parameter <m>phi</m> obtain the posterior distribution for: |
- | * a normal prior for $\beta$ and assuming the remaining parameters are known | + | * a normal prior for <m>beta</m> and assuming the remaining parameters are known |
- | * a normal-scaled-inverse-$chi^2$ prior for $(\beta, \sigma^2)$ and assuming the correlation parameter is known | + | * a normal-scaled-inverse-<latex>$\chi^2$</latex> prior for <latex>$(\beta, \sigma^2)$</latex> and assuming the correlation parameter is known |
- | * a normal-scaled-inverse-$chi^2$ prior for $(\beta, \sigma^2|\phi)$ and assuming a generic prior $p(\phi)$ for correlation parameter. | + | * a normal-scaled-inverse-<m>chi^2</m> prior for <m>(beta, sigma^2|phi)</m> and assuming a generic prior <m>p(phi)</m> for correlation parameter. |
- | - (14) Analyse the Paraná data-set or any other data set of your choice assuming priors for the model parameters and obtaining: | + | - (14) Analise the Paraná data-set or any other data set of your choice assuming priors for the model parameters and obtaining: |
* the posterior distribution for the model parameters | * the posterior distribution for the model parameters | ||
* a map of the predictive mean over the area | * a map of the predictive mean over the area | ||
Line 64: | Line 63: | ||
- (15) Obtain simulations from the Poisson model as shown in Figure 4.1 of the text book for the course. | - (15) Obtain simulations from the Poisson model as shown in Figure 4.1 of the text book for the course. | ||
- (15) Try to reproduce or mimic the results shown in Figure 4.2 of the text book for the course simulating a data set and obtaining a similar data-analysis. **Note:** for the example in the book we have used //set.seed(34)//. | - (15) Try to reproduce or mimic the results shown in Figure 4.2 of the text book for the course simulating a data set and obtaining a similar data-analysis. **Note:** for the example in the book we have used //set.seed(34)//. | ||
- | - (16) Reproduce the simulated binomial data shown in Figure 4.6. Use the package //geoRglm// in conjunction with priors of your choice to obtain predictive distributions for the signal $S(x)$ at locations $x=(0.6, 0.6)$ and $x=(0.9, 0.5)$. Compare the predictive inferences which you obtained in the previous exercise with those obtained by fitting a linear Gaussian model to the empirical logit transformed data, <m>log{(y+0.5)/(n-y+0.5)}</m>. Compare the results of the two previous analyses and comment generally. | + | - (16) Reproduce the simulated binomial data shown in Figure 4.6. Use the package //geoRglm// in conjunction with priors of your choice to obtain predictive distributions for the signal <m>S(x)</m> at locations <latex>$x=(0.6, 0.6)$</latex> and <latex>$x=(0.9, 0.5)$</latex>. Compare the predictive inferences which you obtained in the previous exercise with those obtained by fitting a linear Gaussian model to the empirical logit transformed data, <m>log{(y+0.5)/(n-y+0.5)}</m>. Compare the results of the two previous analyses and comment generally. |
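The empirical logit transformation mentioned in exercise (16) can be computed directly in base R. In this sketch the function name `emp_logit`, the number of trials, the success probability and the simulated counts are illustrative assumptions; they are not the data of Figure 4.6.

```r
# Empirical logit: log((y + 0.5) / (n - y + 0.5)).
# The 0.5 offsets keep the value finite when y = 0 or y = n.
emp_logit <- function(y, n) log((y + 0.5) / (n - y + 0.5))

set.seed(1)
n_trials <- 4                                  # assumed number of trials per location
y <- rbinom(10, size = n_trials, prob = 0.3)   # made-up binomial counts
z <- emp_logit(y, n_trials)                    # responses for a linear Gaussian model
```

The transformed values `z` could then be treated as the response of a Gaussian geostatistical model, which is the comparison the exercise asks for.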
==== Week 5 ==== | ==== Week 5 ==== | ||