Bootstrapping is often used to obtain a robust estimate of the uncertainty of a parameter due to sampling error.
The default algorithm that Zelig uses to simulate quantities of interest is a form of the parametric bootstrap. Zelig also provides an argument to switch to the nonparametric bootstrap. Hereafter, "bootstrap" refers to the nonparametric form.
The bootstrap argument defaults to FALSE and can be set to TRUE or to a numeric value giving the number of bootstrapped datasets to construct. If set to TRUE, 100 bootstrapped datasets are used. The bootstrap works in combination with other Zelig arguments as follows:
Load Zelig and attach the sample data:
library(Zelig)
data(macro)
Estimate the model, setting the number of bootstrapped datasets to construct:
z.out <- zls$new()
z.out$zelig(unem ~ gdp + capmob + trade, data = macro, bootstrap = 500)
By default, summary() shows the point estimates of the parameters from the original data, with standard errors generated by the bootstrap.
summary(z.out)
## Model: Combined Bootstraps
##             Estimate Std.Error z value  Pr(>|z|)
## (Intercept)  6.18129  0.437869  14.117 0.000e+00 ***
## gdp         -0.32360  0.055803  -5.799 6.671e-09 ***
## capmob       1.42194  0.165019   8.617 0.000e+00 ***
## trade        0.01985  0.005335   3.721 1.981e-04  **
## ---
## Signif. codes:  '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## For results from individual bootstrapped datasets, use summary(x, subset = i:j)
## Next step: Use 'setx' method
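For intuition, bootstrap standard errors of this kind can be approximated by hand. The following is a minimal base R sketch of case resampling, not Zelig's internal implementation. It assumes the built-in mtcars data as a stand-in for macro, and the names n_boot, boot_coefs and boot_se are illustrative.

```r
# Nonparametric (case-resampling) bootstrap of OLS coefficients.
# Illustrative only: mtcars stands in for the macro data.
set.seed(1)
n_boot <- 200          # number of bootstrapped datasets (assumed value)
n <- nrow(mtcars)

# Point estimates from the original data
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Refit the model on n_boot datasets resampled with replacement;
# each row of boot_coefs holds the coefficients from one replicate
boot_coefs <- t(replicate(n_boot, {
  idx <- sample.int(n, size = n, replace = TRUE)
  coef(lm(mpg ~ wt + hp, data = mtcars[idx, ]))
}))

# Bootstrap standard error: standard deviation across replicates
boot_se <- apply(boot_coefs, 2, sd)

cbind(Estimate = coef(fit), Std.Error = boot_se)
```

The resulting table pairs the original-data estimates with the spread of the estimates across resampled datasets, which is the same pairing the summary above reports.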
We can instead choose to show the bagging estimator, that is, the average parameter value across all the bootstrapped datasets. Bagging generally accepts some bias in exchange for a reduction in variance, which can result in lower mean squared error (notably in non-linear models). The bagging estimator can be obtained as:
summary(z.out, bagging=TRUE)
## Model: Combined Bootstraps
##             Estimate Std.Error z value  Pr(>|z|)
## (Intercept)  6.16514  0.437869  14.080 0.000e+00 ***
## gdp         -0.32175  0.055803  -5.766 8.125e-09 ***
## capmob       1.41801  0.165019   8.593 0.000e+00 ***
## trade        0.01998  0.005335   3.744 1.809e-04  **
## ---
## Signif. codes:  '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## For results from individual bootstrapped datasets, use summary(x, subset = i:j)
## Next step: Use 'setx' method
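The bagging estimator is simply the mean of each coefficient over the bootstrap replicates. A hedged base R sketch follows. As an assumption for illustration, it uses the built-in mtcars data instead of macro, and the names boot_coefs, bagged and original are hypothetical, not part of Zelig.

```r
# Bagging estimator for OLS coefficients via case resampling.
# Illustrative only: mtcars stands in for the macro data.
set.seed(2)
n_boot <- 200          # number of bootstrapped datasets (assumed value)
n <- nrow(mtcars)

# One row of coefficients per bootstrapped dataset
boot_coefs <- t(replicate(n_boot, {
  idx <- sample.int(n, size = n, replace = TRUE)
  coef(lm(mpg ~ wt + hp, data = mtcars[idx, ]))
}))

# Bagging estimator: average each coefficient across the replicates
bagged <- colMeans(boot_coefs)

# Estimates from the original data, for comparison
original <- coef(lm(mpg ~ wt + hp, data = mtcars))

round(cbind(Original = original, Bagged = bagged), 4)
```

For a linear model the two columns are typically close, as they are in the Zelig output above.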
If we want to inspect particular individual results, the subset argument is available:
summary(z.out, subset=13:15)
## Bootstrapped Dataset 13
## Call:
## z.out$zelig(formula = unem ~ gdp + capmob + trade, data = macro,
##     bootstrap = 500)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -4.4731 -2.1510 -0.7507  1.8016  8.0866
##
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)
## (Intercept)  5.341864   0.419883  12.722  < 2e-16
## gdp         -0.317835   0.062931  -5.050 7.14e-07
## capmob       0.915843   0.180335   5.079 6.23e-07
## trade        0.019740   0.005985   3.298  0.00107
##
## Residual standard error: 2.708 on 346 degrees of freedom
## Multiple R-squared:  0.2023,     Adjusted R-squared:  0.1954
## F-statistic: 29.26 on 3 and 346 DF,  p-value: < 2.2e-16
##
## Bootstrapped Dataset 14
## Call:
## z.out$zelig(formula = unem ~ gdp + capmob + trade, data = macro,
##     bootstrap = 500)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -5.2987 -2.2535 -0.4098  2.0188  5.9420
##
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)
## (Intercept)  5.930725   0.435651  13.613  < 2e-16
## gdp         -0.303369   0.062188  -4.878 1.64e-06
## capmob       1.414034   0.174193   8.118 8.43e-15
## trade        0.024247   0.005542   4.375 1.61e-05
##
## Residual standard error: 2.817 on 346 degrees of freedom
## Multiple R-squared:  0.2789,     Adjusted R-squared:  0.2726
## F-statistic: 44.61 on 3 and 346 DF,  p-value: < 2.2e-16
##
## Bootstrapped Dataset 15
## Call:
## z.out$zelig(formula = unem ~ gdp + capmob + trade, data = macro,
##     bootstrap = 500)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -5.7300 -2.3095  0.0192  2.1121  7.1824
##
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)
## (Intercept)  6.834144   0.490435  13.935  < 2e-16
## gdp         -0.314014   0.067781  -4.633 5.12e-06
## capmob       1.581995   0.176602   8.958  < 2e-16
## trade        0.013709   0.006002   2.284    0.023
##
## Residual standard error: 2.837 on 346 degrees of freedom
## Multiple R-squared:  0.2667,     Adjusted R-squared:  0.2604
## F-statistic: 41.95 on 3 and 346 DF,  p-value: < 2.2e-16
##
## Next step: Use 'setx' method
If bootstraps were obtained, the first m results are the models estimated on the bootstrapped data, and the (m+1)-th result is the model estimated on the original data. The value of m is stored in a field of the Zelig object named bootstrap.num. In our running example this was 500:
summary(z.out, subset=(z.out$bootstrap.num + 1))
## Bootstrapped Dataset 501
## Call:
## z.out$zelig(formula = unem ~ gdp + capmob + trade, data = macro,
##     bootstrap = 500)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -5.3008 -2.0768 -0.3187  1.9789  7.7715
##
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)
## (Intercept)  6.181294   0.450572  13.719  < 2e-16
## gdp         -0.323601   0.062820  -5.151 4.36e-07
## capmob       1.421939   0.166443   8.543 4.22e-16
## trade        0.019854   0.005606   3.542 0.000452
##
## Residual standard error: 2.746 on 346 degrees of freedom
## Multiple R-squared:  0.2878,     Adjusted R-squared:  0.2817
## F-statistic: 46.61 on 3 and 346 DF,  p-value: < 2.2e-16
##
## Next step: Use 'setx' method
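One way to picture this indexing is a list that holds the bootstrap fits first and the original-data fit last. The sketch below is only a mental model in base R, not a description of how Zelig stores its results. The mtcars data and the names m and fits are assumptions for illustration.

```r
# Mental model of the result ordering: m bootstrap fits, then the original fit.
# Illustrative only: mtcars stands in for the macro data.
set.seed(3)
m <- 5                 # number of bootstrapped datasets (kept small here)
n <- nrow(mtcars)

# Positions 1..m: models estimated on resampled datasets
fits <- lapply(seq_len(m), function(i) {
  idx <- sample.int(n, size = n, replace = TRUE)
  lm(mpg ~ wt + hp, data = mtcars[idx, ])
})

# Position m + 1: model estimated on the original data
fits[[m + 1]] <- lm(mpg ~ wt + hp, data = mtcars)

length(fits)           # m + 1 models in total
coef(fits[[m + 1]])    # coefficients from the original-data fit
```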