@@ -137,8 +137,7 @@ old male blue collar worker who was employed in 1997, using the posterior mean e
137137(p_unemploy_base <- round(pnorm(res_beta[1, 2]),4))
138138```
139139
140- The estimated risk to be unemployed in 1998 for a baseline person is very low with a value of ` r p_unemploy_base ` and even lower
141- for a white collar worker. This risk is higher for a female,
140+ The estimated risk to be unemployed in 1998 for a baseline person is very low with a value of ` r p_unemploy_base ` and even lower for a white collar worker. This risk is higher for a female,
142141an older person and particularly high if the person was unemployed in 1997.
143142
144143We next visualize the estimated posterior distributions for the regression
@@ -177,17 +176,18 @@ assess the efficiency of the sampler.
177176library("coda")
178177ess <- round(coda::effectiveSize(betas),1)
179178ineff <- round(M/ ess,2)
179+
180180res_eff <-cbind(ess, ineff)
181181knitr::kable(res_eff)
182182```
183- The effective sample size is larger than 3000 for each regression effect, thus yielding inefficiency factors below ` r min (ineff) ` .
183+ The effective sample size is larger than ` r min(ess) ` for each regression effect, thus yielding inefficiency factors below ` r max (ineff) ` .
184184
185185The sampler is easy to implement, however there might be problems when the response variable contains either only very few or very many successes.
186186
187187### Example 8.3
188188
189- To illustrate this issue, we use data where in $N = 500$ trials only 1
190- success or only 1 failure is observed.
189+ To illustrate this issue, we use data where in $N = 500$ trials only 1 failure
190+ or only 1 success is observed.
191191
192192``` {r}
193193set.seed(1234)
@@ -202,28 +202,36 @@ y2 <- c(rep(0, N-1), 1)
202202betas2 <- probit(y2, X, b0 = 0, B0 = 10000, burnin=1000,M=M)
203203```
204204
205- In both cases the autocorrelation of the draws decreases very slowly
206- and remains still high even for higher lags. Also the ESSs are substantially reduced.
205+ In both cases the empirical autocorrelation of the draws decreases very slowly and remains high even for a lag of 40.
207206
208207``` {r, echo = -c(1:2)}
209208if (pdfplots) {
210209 pdf("8-1_3.pdf", width = 8, height = 5)
211210}
212- par(mfrow = c(2, 2), mar = c(2.5, 1.5, 1.5, .1), mgp = c(1.5, .5, 0), lwd = 1.5)
213- plot(betas1, type = "l", main = "", xlab = "Draws after burnin", ylab = "")
214- acf(betas1)
211+ labels <- expression(beta[0])
212+
213+ par(mfrow = c(2, 2), mar = c(2.5, 2.5, 1.5, .1), mgp = c(1.5, .5, 0), lwd = 1.5)
214+
215+ plot(betas1, type = "l", main = "N=500, 1 failure", xlab = "Draws after burnin",
216+ ylab = labels)
217+ acf(betas1, ylab="empirical ACF")
215218
216- round(M/ effectiveSize(betas1),2)
219+ (ess1 <- effectiveSize(betas1))
220+ round(M/ ess1,2)
217221
218- plot(betas2, type = "l", main = "", xlab = "Draws after burnin", ylab = "")
219- acf(betas2)
222+ plot(betas2, type = "l", main = "N=500, 1 success", xlab = "Draws after burnin",
223+ ylab = labels)
224+ acf(betas2, ylab="empirical ACF")
220225
221- round(M/ effectiveSize(betas2),2)
226+ (ess2<- effectiveSize(betas2))
227+ round(M/ ess2,2)
222228```
229+ Hence for these data sets the estimated ESS of the intercept has a value of ~ 150, yielding an inefficiency factor of ~ 130.
223230
224- High autocorrelation in MCMC draws for probit models not only occurs
225- if successes or failures are very rare, but also when a covariate (or
226- a linear combination of covariates) perfectly allows to predict
231+
232+ High autocorrelation in MCMC draws for probit models occur not only when
233+ either successes or failures are rare, but also when a covariate (or a
234+ linear combination of covariates) perfectly allows to predict
227235successes and/or failures. Complete separation means that both
228236successes and failures can be perfectly predicted by a covariate,
229237whereas quasi-complete separation means that either successes or
@@ -245,33 +253,39 @@ y <- rep(c(0, 1), c(ns, N - ns))
245253table(x.sep, y)
246254```
247255
248- We estimate the model parameters and plot the ACF of the draws. We see very high autocorrelations even at
249- lag 35 and hence the ESSs are very low.
256+ We estimate the model parameters $\boldsymbol{\beta}= \beta_0, \beta_1)$
257+ under the Normal prior $\Normal{mathbf{0}, 10000\mathbf{I}}$ and run the sampler for $M=20000$ iterations after a burnin of 1000.
258+
259+ From the plot of the ACF of the draws we see that auto
260+ are close to 1 even at lag 40.
250261
251262``` {r, echo = -c(1:2)}
252263if (pdfplots) {
253264 pdf("8-1_4.pdf", width = 8, height = 5)
254265}
255- par(mfrow = c(2, 2), mar = c(2.5, 1 .5, 1.5, .1), mgp = c(1.5, .5, 0), lwd = 1.5)
266+ par(mfrow = c(2, 2), mar = c(2.5, 2 .5, 1.5, .1), mgp = c(1.5, .5, 0), lwd = 1.5)
256267
257268set.seed(1234)
258269X.sep <- cbind(rep(1, N), x.sep)
259- betas.sep <- probit(y, X.sep, b0 = 0, B0 = 10000)
270+ betas.sep <- probit(y, X.sep, b0 = 0, B0 = 10000,burnin=1000,M=M )
260271
261- plot(betas.sep[, 1], type = "l", xlab = "Draws after burnin", ylab = "")
262- acf(betas.sep[, 1])
272+ labels <- expression(beta[0], beta[1])
273+ plot(betas.sep[, 1], type = "l", xlab = "Draws after burnin", ylab = labels[1])
274+ acf(betas.sep[, 1], ylab="empirical ACF")
263275
264- plot(betas.sep[, 2], type = "l", main = "", xlab = "" , ylab = "" )
265- acf(betas.sep[, 2])
276+ plot(betas.sep[, 2], type = "l", xlab = "Draws after burnin" , ylab =labels[2] )
277+ acf(betas.sep[, 2], ylab="empirical ACF" )
266278
267- effectiveSize(betas.sep)
268- round(M/ effectiveSize(betas .sep) ,2)
279+ (ess.sep<- round(coda:: effectiveSize(betas.sep),2) )
280+ round(M/ ess .sep,2)
269281```
282+ Hence the ESSs are very low with a value of ~ 8, resulting in inefficiency factors
283+ of ~ 2400.
270284
271285### Example 8.5
272286
273287To illustrate quasi-separation we use the same responses as in Example
274- 8.3 , but now set $x=1$ for all successes and additionally for 100
288+ 8.4 , but now set $x=1$ for all successes and additionally for 100
275289failures. Hence for $x=0$ always a failure is observed, whereas for
276290$x=1$ both successes and failures occur.
277291
@@ -280,34 +294,33 @@ x.qus1 <- rep(c(0, 1), c(ns-100, N - ns+100))
280294table(x.qus1, y)
281295```
282296
283- Again autocorrelations are very high for both the intercept as well as
284- the covariate effect and the ESSs are very low.
285-
297+ We again estimate the regression effects using data augmentation and Gibbs Sampling.
286298``` {r, echo = -c(1:2)}
287299if (pdfplots) {
288300 pdf("8-1_5.pdf", width = 8, height = 5)
289301}
290302
291- par(mfrow = c(2, 2), mar = c(2.5, 1 .5, 1.5, .1), mgp = c(1.5, .5, 0), lwd = 1.5)
303+ par(mfrow = c(2, 2), mar = c(2.5, 2 .5, 1.5, .1), mgp = c(1.5, .5, 0), lwd = 1.5)
292304
293305set.seed(1234)
294306X.qus1 <- cbind(rep(1, N), x.qus1)
295- betas.qus1 <- probit(y, X.qus1, b0 = 0, B0 = 10000)
307+ betas.qus1 <- probit(y, X.qus1, b0 = 0, B0 = 10000, burnin=1000, M=M )
296308
297- plot(betas.qus1[, 1], type = "l", main = "", xlab = "" , ylab = "" )
298- acf(betas.qus1[, 1])
309+ plot(betas.qus1[, 1], type = "l", xlab="Draws after burnin" , ylab = labels[1] )
310+ acf(betas.qus1[, 1],ylab="empirical ACF" )
299311
300- plot(betas.qus1[, 2], type = "l", main = "", xlab = "" , ylab = "" )
301- acf(betas.qus1[, 2])
312+ plot(betas.qus1[, 2], type = "l", xlab = "Draws after burnin" , ylab = labels[2] )
313+ acf(betas.qus1[, 2],ylab="empirical ACF" )
302314
303- coda::effectiveSize(betas.qus1)
315+ (ess.qus1 <- round(coda::effectiveSize(betas.qus1),2))
316+ round(M/ ess.qus1,2)
304317```
318+ Again autocorrelations are very high for both the intercept as well as
319+ the covariate effect resulting in high inefficiency factors of ~ 2340.
320+
305321
306- If we change the setting so that $x$ takes values of $0$ not only for
307- failures but also for some successes, whereas $x=1$ for all successes,
308- we observe low autocorrelations for the intercept but still very high
309- autocorrelations for the covariate effect. Also the ESSs are high for
310- the intercept and low for the covariate effect.
322+ We now change the setting so that $x$ takes values of $0$ not only for
323+ failures but also for some successes, whereas $x=1$ for all successes.
311324
312325``` {r, echo = -c(1:2)}
313326if (pdfplots) {
@@ -319,17 +332,23 @@ table(x.qus2, y)
319332
320333set.seed(1234)
321334X.qus2 <- cbind(rep(1, N), x.qus2)
322- betas.qus2 <- probit(y, X.qus2, b0 = 0, B0 = 10000)
335+ betas.qus2 <- probit(y, X.qus2, b0 = 0, B0 = 10000, burnin=1000, M=M )
323336
324- par(mfrow = c(2, 2), mar = c(2.5, 1 .5, 1.5, .1), mgp = c(1.5, .5, 0), lwd = 1.5)
325- plot(betas.qus2[, 1], type = "l", main = "", xlab = "" , ylab = "" )
326- acf(betas.qus2[, 1])
337+ par(mfrow = c(2, 2), mar = c(2.5, 2 .5, 1.5, .1), mgp = c(1.5, .5, 0), lwd = 1.5)
338+ plot(betas.qus2[, 1], type = "l", xlab="Draws after burnin" , ylab = labels[1] )
339+ acf(betas.qus2[, 1], ylab="empirical ACF" )
327340
328- plot(betas.qus2[, 2], type = "l", main = "", xlab = "" , ylab = "" )
329- acf(betas.qus2[, 2])
341+ plot(betas.qus2[, 2], type = "l", xlab="Draws after burnin" , ylab = labels[2] )
342+ acf(betas.qus2[, 2], ylab="empirical ACF" )
330343
331- effectiveSize(betas.qus2)
344+ (ess.qus2 <- round(coda::effectiveSize(betas.qus2),2))
345+ (ineff.qus2 <- round(M/ ess.qus2,2))
332346```
347+ Autocorrelations of the intercept are low and close to zero for small lags but
348+ remain very high for the covariate effect. Hence we have a high ESS for
349+ the intercept (` r ess.qus2[1] ` ) and low for the covariate effect (` r ess.qus2[2] ` ), resulting in an inefficiency
350+ factor of ` r ineff.qus2[1] ` for the intercept, but of ` r ineff.qus2[2] ` for the
351+ effect of the covariate.
333352
334353High autocorrelations typically indicate problems with the sampler. If
335354there is complete or quasi-complete separation in the data, the
@@ -339,38 +358,57 @@ regression effects will result in an improper posterior
339358distribution. Hence, a proper prior is required to avoid improper
340359posteriors in case of separation.
341360
342- We now analyze the data under a more informative prior, i.e.,
361+ We now analyze the data of example 8.4. under the more informative prior
343362$\mathcal{N}(\mathbf{0}, \mathbf{I})$. With this prior we assume that both
344363$P(y=1)$ and $P(y=0)$ have a prior probability of $\approx 0.95$ to be
345364in the interval $[ 0.023, 0.977] $.
346365
366+
367+ We compare the estimation results to those from exercise 8.4, where the prior was
368+ $\mathcal{N}(\mathbf{0}, 10000 \mathbf{I})$
369+
347370``` {r}
348371set.seed(1234)
349- betas.sep1 <- probit(y, X.sep, b0 = 0, B0 = 1)
372+ betas.sep1 <- probit(y, X.sep, b0 = 0, B0 = 1, burnin=1000, M=M)
373+
374+ res_betas.sep <- t(apply(betas.sep, 2, res.mcmc))
375+ rownames(res_betas.sep) <- c("Intercept", "X")
376+ knitr:: kable(res_betas.sep)
350377
351378res_betas.sep1 <- t(apply(betas.sep1, 2, res.mcmc))
352- knitr::kable(round(res_betas.sep1, 3))
379+ rownames(res_betas.sep1)<- c("Intercept", "X")
380+ knitr:: kable(res_betas.sep1)
353381```
354-
355- In this case the autocorrelations are much lower and the ESSs are
356- roughly 600.
382+ We see that with the tighter prior the estimates are more shrunk to zero.
357383
358384``` {r, echo = -c(1:2)}
359385if (pdfplots) {
360386 pdf("8-1_6.pdf", width = 8, height = 5)
361387}
362388
363- par(mfrow = c(2, 2), mar = c(2.5, 1 .5, 1.5, .1), mgp = c(1.5, .5, 0), lwd = 1.5)
389+ par(mfrow = c(2, 2), mar = c(2.5, 2 .5, 1.5, .1), mgp = c(1.5, .5, 0), lwd = 1.5)
364390
365- plot(betas.sep1[, 1], type = "l", main = "", xlab = "" , ylab = "" )
366- acf(betas.sep1[, 1])
391+ plot(betas.sep1[, 1], type = "l", xlab="Draws after burnin" , ylab = labels[1] )
392+ acf(betas.sep1[, 1], ylab="empirical ACF" )
367393
368- plot(betas.sep1[, 2], type = "l", main = "", xlab = "" , ylab = "" )
369- acf(betas.sep1[, 2])
394+ plot(betas.sep1[, 2], type = "l", xlab="Draws after burnin" , ylab = labels[2] )
395+ acf(betas.sep1[, 2],ylab="empirical ACF" )
370396
371- effectiveSize(betas.sep1)
397+ (ess.sep <- round(effectiveSize(betas.sep),2))
398+ (ineff.sep <- round(M/ ess.sep,2))
399+
400+ (ess.sep1 <- round(effectiveSize(betas.sep1),2))
401+ (ineff.sep1 <- round(M/ ess.sep1,2))
402+
403+ ineff<- cbind(ineff.sep, ineff.sep1)
404+ colnames(ineff)<- c("N(0,10000)", "N(0,1)")
405+ rownames(ineff)<- c("Intercept", "X")
406+ knitr::kable(ineff)
372407```
373408
409+ With a tighter prior the autocorrelations are much lower and hence the ESSs are higher and inefficiency is lower.
410+
411+
374412## Section 8.1.2: Logit model
375413
376414### Example 8.6: Labor market data
0 commit comments