Skip to content

Commit b5aeac2

Browse files
committed
minor changes in text probit examples
1 parent 6bec3be commit b5aeac2

1 file changed

Lines changed: 100 additions & 62 deletions

File tree

vignettes/Chapter08.Rmd

Lines changed: 100 additions & 62 deletions
Original file line numberDiff line numberDiff line change
@@ -137,8 +137,7 @@ old male blue collar worker who was employed in 1997, using the posterior mean e
137137
(p_unemploy_base <- round(pnorm(res_beta[1, 2]),4))
138138
```
139139

140-
The estimated risk to be unemployed in 1998 for a baseline person is very low with a value of `r p_unemploy_base`and even lower
141-
for a white collar worker. This risk is higher for a female,
140+
The estimated risk to be unemployed in 1998 for a baseline person is very low with a value of `r p_unemploy_base` and even lower for a white collar worker. This risk is higher for a female,
142141
an older person and particularly high if the person was unemployed in 1997.
143142

144143
We next visualize the estimated posterior distributions for the regression
@@ -177,17 +176,18 @@ assess the efficiency of the sampler.
177176
library("coda")
178177
ess <- round(coda::effectiveSize(betas),1)
179178
ineff <- round(M/ ess,2)
179+
180180
res_eff <-cbind(ess, ineff)
181181
knitr::kable(res_eff)
182182
```
183-
The effective sample size is larger than 3000 for each regression effect, thus yielding inefficiency factors below `r min(ineff)`.
183+
The effective sample size is larger than `r min(ess)` for each regression effect, thus yielding inefficiency factors below `r max(ineff)`.
184184

185185
The sampler is easy to implement, however there might be problems when the response variable contains either only very few or very many successes.
186186

187187
### Example 8.3
188188

189-
To illustrate this issue, we use data where in $N = 500$ trials only 1
190-
success or only 1 failure is observed.
189+
To illustrate this issue, we use data where in $N = 500$ trials only 1 failure
190+
or only 1 success is observed.
191191

192192
```{r}
193193
set.seed(1234)
@@ -202,28 +202,36 @@ y2 <- c(rep(0, N-1), 1)
202202
betas2 <- probit(y2, X, b0 = 0, B0 = 10000, burnin=1000,M=M)
203203
```
204204

205-
In both cases the autocorrelation of the draws decreases very slowly
206-
and remains still high even for higher lags. Also the ESSs are substantially reduced.
205+
In both cases the empirical autocorrelation of the draws decreases very slowly and remains high even for a lag of 40.
207206

208207
```{r, echo = -c(1:2)}
209208
if (pdfplots) {
210209
pdf("8-1_3.pdf", width = 8, height = 5)
211210
}
212-
par(mfrow = c(2, 2), mar = c(2.5, 1.5, 1.5, .1), mgp = c(1.5, .5, 0), lwd = 1.5)
213-
plot(betas1, type = "l", main = "", xlab = "Draws after burnin", ylab = "")
214-
acf(betas1)
211+
labels <- expression(beta[0])
212+
213+
par(mfrow = c(2, 2), mar = c(2.5, 2.5, 1.5, .1), mgp = c(1.5, .5, 0), lwd = 1.5)
214+
215+
plot(betas1, type = "l", main = "N=500, 1 failure", xlab = "Draws after burnin",
216+
ylab = labels)
217+
acf(betas1, ylab="empirical ACF")
215218
216-
round(M/ effectiveSize(betas1),2)
219+
(ess1 <- effectiveSize(betas1))
220+
round(M/ ess1,2)
217221
218-
plot(betas2, type = "l", main = "", xlab = "Draws after burnin", ylab = "")
219-
acf(betas2)
222+
plot(betas2, type = "l", main = "N=500, 1 success", xlab = "Draws after burnin",
223+
ylab = labels)
224+
acf(betas2, ylab="empirical ACF")
220225
221-
round(M/ effectiveSize(betas2),2)
226+
(ess2<- effectiveSize(betas2))
227+
round(M/ ess2,2)
222228
```
229+
Hence for these data sets the estimated ESS of the intercept has a value of ~ 150, yielding an inefficiency factor of ~ 130.
223230

224-
High autocorrelation in MCMC draws for probit models not only occurs
225-
if successes or failures are very rare, but also when a covariate (or
226-
a linear combination of covariates) perfectly allows to predict
231+
232+
High autocorrelation in MCMC draws for probit models occur not only when
233+
either successes or failures are rare, but also when a covariate (or a
234+
linear combination of covariates) perfectly allows to predict
227235
successes and/or failures. Complete separation means that both
228236
successes and failures can be perfectly predicted by a covariate,
229237
whereas quasi-complete separation means that either successes or
@@ -245,33 +253,39 @@ y <- rep(c(0, 1), c(ns, N - ns))
245253
table(x.sep, y)
246254
```
247255

248-
We estimate the model parameters and plot the ACF of the draws. We see very high autocorrelations even at
249-
lag 35 and hence the ESSs are very low.
256+
We estimate the model parameters $\boldsymbol{\beta}= \beta_0, \beta_1)$
257+
under the Normal prior $\Normal{mathbf{0}, 10000\mathbf{I}}$ and run the sampler for $M=20000$ iterations after a burnin of 1000.
258+
259+
From the plot of the ACF of the draws we see that auto
260+
are close to 1 even at lag 40.
250261

251262
```{r, echo = -c(1:2)}
252263
if (pdfplots) {
253264
pdf("8-1_4.pdf", width = 8, height = 5)
254265
}
255-
par(mfrow = c(2, 2), mar = c(2.5, 1.5, 1.5, .1), mgp = c(1.5, .5, 0), lwd = 1.5)
266+
par(mfrow = c(2, 2), mar = c(2.5, 2.5, 1.5, .1), mgp = c(1.5, .5, 0), lwd = 1.5)
256267
257268
set.seed(1234)
258269
X.sep <- cbind(rep(1, N), x.sep)
259-
betas.sep <- probit(y, X.sep, b0 = 0, B0 = 10000)
270+
betas.sep <- probit(y, X.sep, b0 = 0, B0 = 10000,burnin=1000,M=M)
260271
261-
plot(betas.sep[, 1], type = "l", xlab = "Draws after burnin", ylab = "")
262-
acf(betas.sep[, 1])
272+
labels <- expression(beta[0], beta[1])
273+
plot(betas.sep[, 1], type = "l", xlab = "Draws after burnin", ylab = labels[1])
274+
acf(betas.sep[, 1], ylab="empirical ACF")
263275
264-
plot(betas.sep[, 2], type = "l", main = "", xlab = "", ylab = "")
265-
acf(betas.sep[, 2])
276+
plot(betas.sep[, 2], type = "l", xlab = "Draws after burnin", ylab =labels[2])
277+
acf(betas.sep[, 2], ylab="empirical ACF")
266278
267-
effectiveSize(betas.sep)
268-
round(M/ effectiveSize(betas.sep),2)
279+
(ess.sep<- round(coda::effectiveSize(betas.sep),2))
280+
round(M/ ess.sep,2)
269281
```
282+
Hence the ESSs are very low with a value of ~ 8, resulting in inefficiency factors
283+
of ~2400.
270284

271285
### Example 8.5
272286

273287
To illustrate quasi-separation we use the same responses as in Example
274-
8.3, but now set $x=1$ for all successes and additionally for 100
288+
8.4, but now set $x=1$ for all successes and additionally for 100
275289
failures. Hence for $x=0$ always a failure is observed, whereas for
276290
$x=1$ both successes and failures occur.
277291

@@ -280,34 +294,33 @@ x.qus1 <- rep(c(0, 1), c(ns-100, N - ns+100))
280294
table(x.qus1, y)
281295
```
282296

283-
Again autocorrelations are very high for both the intercept as well as
284-
the covariate effect and the ESSs are very low.
285-
297+
We again estimate the regression effects using data augmentation and Gibbs Sampling.
286298
```{r, echo = -c(1:2)}
287299
if (pdfplots) {
288300
pdf("8-1_5.pdf", width = 8, height = 5)
289301
}
290302
291-
par(mfrow = c(2, 2), mar = c(2.5, 1.5, 1.5, .1), mgp = c(1.5, .5, 0), lwd = 1.5)
303+
par(mfrow = c(2, 2), mar = c(2.5, 2.5, 1.5, .1), mgp = c(1.5, .5, 0), lwd = 1.5)
292304
293305
set.seed(1234)
294306
X.qus1 <- cbind(rep(1, N), x.qus1)
295-
betas.qus1 <- probit(y, X.qus1, b0 = 0, B0 = 10000)
307+
betas.qus1 <- probit(y, X.qus1, b0 = 0, B0 = 10000, burnin=1000, M=M)
296308
297-
plot(betas.qus1[, 1], type = "l", main = "", xlab = "", ylab = "")
298-
acf(betas.qus1[, 1])
309+
plot(betas.qus1[, 1], type = "l", xlab="Draws after burnin", ylab = labels[1])
310+
acf(betas.qus1[, 1],ylab="empirical ACF")
299311
300-
plot(betas.qus1[, 2], type = "l", main = "", xlab = "", ylab = "")
301-
acf(betas.qus1[, 2])
312+
plot(betas.qus1[, 2], type = "l", xlab= "Draws after burnin", ylab = labels[2])
313+
acf(betas.qus1[, 2],ylab="empirical ACF")
302314
303-
coda::effectiveSize(betas.qus1)
315+
(ess.qus1 <- round(coda::effectiveSize(betas.qus1),2))
316+
round(M/ ess.qus1,2)
304317
```
318+
Again autocorrelations are very high for both the intercept as well as
319+
the covariate effect resulting in high inefficiency factors of ~ 2340.
320+
305321

306-
If we change the setting so that $x$ takes values of $0$ not only for
307-
failures but also for some successes, whereas $x=1$ for all successes,
308-
we observe low autocorrelations for the intercept but still very high
309-
autocorrelations for the covariate effect. Also the ESSs are high for
310-
the intercept and low for the covariate effect.
322+
We now change the setting so that $x$ takes values of $0$ not only for
323+
failures but also for some successes, whereas $x=1$ for all successes.
311324

312325
```{r, echo = -c(1:2)}
313326
if (pdfplots) {
@@ -319,17 +332,23 @@ table(x.qus2, y)
319332
320333
set.seed(1234)
321334
X.qus2 <- cbind(rep(1, N), x.qus2)
322-
betas.qus2 <- probit(y, X.qus2, b0 = 0, B0 = 10000)
335+
betas.qus2 <- probit(y, X.qus2, b0 = 0, B0 = 10000, burnin=1000, M=M)
323336
324-
par(mfrow = c(2, 2), mar = c(2.5, 1.5, 1.5, .1), mgp = c(1.5, .5, 0), lwd = 1.5)
325-
plot(betas.qus2[, 1], type = "l", main = "", xlab = "", ylab = "")
326-
acf(betas.qus2[, 1])
337+
par(mfrow = c(2, 2), mar = c(2.5, 2.5, 1.5, .1), mgp = c(1.5, .5, 0), lwd = 1.5)
338+
plot(betas.qus2[, 1], type = "l", xlab="Draws after burnin", ylab = labels[1])
339+
acf(betas.qus2[, 1], ylab="empirical ACF")
327340
328-
plot(betas.qus2[, 2], type = "l", main = "", xlab = "", ylab = "")
329-
acf(betas.qus2[, 2])
341+
plot(betas.qus2[, 2], type = "l", xlab="Draws after burnin", ylab = labels[2])
342+
acf(betas.qus2[, 2], ylab="empirical ACF")
330343
331-
effectiveSize(betas.qus2)
344+
(ess.qus2 <- round(coda::effectiveSize(betas.qus2),2))
345+
(ineff.qus2 <- round(M/ ess.qus2,2))
332346
```
347+
Autocorrelations of the intercept are low and close to zero for small lags but
348+
remain very high for the covariate effect. Hence we have a high ESS for
349+
the intercept (`r ess.qus2[1]`) and low for the covariate effect (`r ess.qus2[2]`), resulting in an inefficiency
350+
factor of `r ineff.qus2[1]` for the intercept, but of `r ineff.qus2[2]` for the
351+
effect of the covariate.
333352

334353
High autocorrelations typically indicate problems with the sampler. If
335354
there is complete or quasi-complete separation in the data, the
@@ -339,38 +358,57 @@ regression effects will result in an improper posterior
339358
distribution. Hence, a proper prior is required to avoid improper
340359
posteriors in case of separation.
341360

342-
We now analyze the data under a more informative prior, i.e.,
361+
We now analyze the data of example 8.4. under the more informative prior
343362
$\mathcal{N}(\mathbf{0}, \mathbf{I})$. With this prior we assume that both
344363
$P(y=1)$ and $P(y=0)$ have a prior probability of $\approx 0.95$ to be
345364
in the interval $[0.023, 0.977]$.
346365

366+
367+
We compare the estimation results to those from exercise 8.4, where the prior was
368+
$\mathcal{N}(\mathbf{0}, 10000 \mathbf{I})$
369+
347370
```{r}
348371
set.seed(1234)
349-
betas.sep1 <- probit(y, X.sep, b0 = 0, B0 = 1)
372+
betas.sep1 <- probit(y, X.sep, b0 = 0, B0 = 1, burnin=1000, M=M)
373+
374+
res_betas.sep <- t(apply(betas.sep, 2, res.mcmc))
375+
rownames(res_betas.sep) <- c("Intercept", "X")
376+
knitr:: kable(res_betas.sep)
350377
351378
res_betas.sep1 <- t(apply(betas.sep1, 2, res.mcmc))
352-
knitr::kable(round(res_betas.sep1, 3))
379+
rownames(res_betas.sep1)<- c("Intercept", "X")
380+
knitr:: kable(res_betas.sep1)
353381
```
354-
355-
In this case the autocorrelations are much lower and the ESSs are
356-
roughly 600.
382+
We see that with the tighter prior the estimates are more shrunk to zero.
357383

358384
```{r, echo = -c(1:2)}
359385
if (pdfplots) {
360386
pdf("8-1_6.pdf", width = 8, height = 5)
361387
}
362388
363-
par(mfrow = c(2, 2), mar = c(2.5, 1.5, 1.5, .1), mgp = c(1.5, .5, 0), lwd = 1.5)
389+
par(mfrow = c(2, 2), mar = c(2.5, 2.5, 1.5, .1), mgp = c(1.5, .5, 0), lwd = 1.5)
364390
365-
plot(betas.sep1[, 1], type = "l", main = "", xlab = "", ylab = "")
366-
acf(betas.sep1[, 1])
391+
plot(betas.sep1[, 1], type = "l", xlab="Draws after burnin", ylab = labels[1])
392+
acf(betas.sep1[, 1], ylab="empirical ACF")
367393
368-
plot(betas.sep1[, 2], type = "l", main = "", xlab = "", ylab = "")
369-
acf(betas.sep1[, 2])
394+
plot(betas.sep1[, 2], type = "l", xlab="Draws after burnin", ylab = labels[2])
395+
acf(betas.sep1[, 2],ylab="empirical ACF")
370396
371-
effectiveSize(betas.sep1)
397+
(ess.sep <- round(effectiveSize(betas.sep),2))
398+
(ineff.sep <- round(M/ ess.sep,2))
399+
400+
(ess.sep1 <- round(effectiveSize(betas.sep1),2))
401+
(ineff.sep1 <- round(M/ ess.sep1,2))
402+
403+
ineff<- cbind(ineff.sep, ineff.sep1)
404+
colnames(ineff)<- c("N(0,10000)", "N(0,1)")
405+
rownames(ineff)<- c("Intercept", "X")
406+
knitr::kable(ineff)
372407
```
373408

409+
With a tighter prior the autocorrelations are much lower and hence the ESSs are higher and inefficiency is lower.
410+
411+
374412
## Section 8.1.2: Logit model
375413

376414
### Example 8.6: Labor market data

0 commit comments

Comments
 (0)