Installation and Quickstart

This guide is designed to get you up and running with the current release of Zelig (5.1.0).


Installing R and Zelig

Before using Zelig, you will need to download and install both the R statistical program and the Zelig package:

Installing R

To install R, go to http://www.r-project.org/ and select the CRAN option from the left-hand menu (CRAN is the Comprehensive R Archive Network, where all files related to R can be found). Pick the CRAN mirror closest to your current geographic location; the archive is mirrored in many locations, and choosing a nearby mirror should maximize your download speed. Then follow the instructions for downloading R for Linux, Mac OS X, or Windows.


Installing Zelig

Zelig 5 is available on CRAN, as is the ZeligChoice package containing additional models. To install Zelig, or other Zelig add-on modules, from CRAN enter the following at the R prompt:

install.packages("Zelig")
install.packages("ZeligChoice")
install.packages("ZeligEI")

R will then ask you to select a local CRAN mirror from a list to use for downloading.
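If some of these packages may already be installed, it can help to check before installing. The following is a minimal sketch using only base R; the helper name missing_packages is hypothetical (it is not part of Zelig or R), and the mirror URL is an assumption that you should replace with your preferred CRAN mirror.

```r
# Hypothetical helper (not part of Zelig): return the packages in `pkgs`
# that are not yet installed in any library on this machine.
missing_packages <- function(pkgs) {
  installed <- rownames(installed.packages())
  setdiff(pkgs, installed)
}

zelig_pkgs <- c("Zelig", "ZeligChoice", "ZeligEI")
to_install <- missing_packages(zelig_pkgs)

# Install only what is missing. The interactive() guard keeps scripted runs
# from triggering downloads; drop it if you want unattended installs.
# The mirror URL below is an assumption; substitute your own.
if (length(to_install) > 0 && interactive()) {
  install.packages(to_install, repos = "https://cloud.r-project.org")
}
```

Passing repos explicitly skips the mirror selection prompt.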

Beta Release

Beta releases are updated with the latest fixes and newest experimental features, and generally reflect a copy currently being tested before release on CRAN. To download this release, enter one of the following into an R console:

If you already have all of the Zelig dependencies installed (for example, if you have a prior version of Zelig running and are seeking to replace it with the Beta version):

install.packages("Zelig", type = "source", repos = "http://r.iq.harvard.edu/")

If you do not already have Zelig installed and anticipate missing package dependencies, add CRAN as a secondary repository by using:

install.packages("Zelig", type = "both", repos = c("http://r.iq.harvard.edu/", "http://cran.stat.ucla.edu/"))

(perhaps substituting the URL of your favorite local CRAN mirror). When prompted, answer yes to the question of whether “to install from sources the package which needs compilation?”

The ZeligGAM package, containing additional Zelig models described on this site, is presently available only in beta. If you wish to use these models, install it in the same way:

install.packages("ZeligGAM", type = "source", repos = "http://r.iq.harvard.edu/")

Development Release

Development versions contain the latest code under development and may not be fully tested. The following example downloads Zelig; to install any other package, substitute its name:

# This installs devtools package, if not already installed
install.packages("devtools")
# This loads devtools
library(devtools)
# This downloads the latest Zelig from the IQSS Github repo
install_github('IQSS/Zelig')

If you have successfully installed the program, you will see the following message: “DONE (Zelig)”.


Quickstart Guide

Once Zelig is successfully downloaded and installed, let’s load the package and walk through an example. The swiss dataset contains data on fertility and socioeconomic factors in Switzerland’s 47 French-speaking provinces in 1888 (Mosteller and Tukey, 1977, 549-551). We will model the effect of education on fertility, where education is measured as the percent of draftees with education beyond primary school and fertility is measured using the common standardized fertility measure (see Muehlenbein (2010, 80-81) for details).

Loading Zelig

First, open your R console and load Zelig:

library(Zelig)

Building Models

Let’s assume we want to estimate the effect of education on fertility. Since fertility is a continuous variable, least squares is an appropriate model choice. We first create a Zelig least squares object:

# initialize Zelig5 least squares object
z5 <- zls$new()

To estimate our model, we call the zelig() method, a function internal to the Zelig object. We pass the zelig() method two arguments, a model formula and the data:

# estimate ls model
z5$zelig(Fertility ~ Education, data = swiss)
# model summary
summary(z5)
## Model:
##
## Call:
## z5$zelig(formula = Fertility ~ Education, data = swiss)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -17.036  -6.711  -1.011   9.526  19.689
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)  79.6101     2.1041  37.836  < 2e-16
## Education    -0.8624     0.1448  -5.954 3.66e-07
##
## Residual standard error: 9.446 on 45 degrees of freedom
## Multiple R-squared:  0.4406,     Adjusted R-squared:  0.4282
## F-statistic: 35.45 on 1 and 45 DF,  p-value: 3.659e-07
##
## Next step: Use 'setx' method

The -0.8624 coefficient on education suggests a negative relationship between the education of a province and its fertility rate. More precisely, for every one percentage point increase in draftees educated beyond primary school, the fertility rate of the province decreases by 0.8624 units.

To help us better interpret this finding, we may want other quantities of interest, such as expected values or first differences. Zelig makes this simple by automating the translation of model estimates into interpretable quantities of interest using Monte Carlo simulation methods (see King, Tomz, and Wittenberg (2000) for more information). For example, let’s say we want to examine the effect of increasing the percent of draftees educated from 5 to 15. To do so, we set our predictor values using the setx() and setx1() methods:

# set education to 5
z5$setx(Education = 5)
# set education to 15
z5$setx1(Education = 15)
# model summary
summary(z5)
## setx:
##   (Intercept) Education
## 1           1         5
## setx1:
##   (Intercept) Education
## 1           1        15
##
## Next step: Use 'sim' method
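The two scenarios set above correspond to two points on the fitted regression line. As a rough cross-check that does not depend on Zelig, the sketch below refits the same model with base R's lm() and computes the expected fertility at each education level along with their difference. Using lm() here is an assumption made for illustration; it is not how Zelig computes its quantities of interest, which come from simulation.

```r
# Base-R cross-check: lm() stands in for the Zelig least squares model.
fit <- lm(Fertility ~ Education, data = swiss)

# The two scenarios: 5 and 15 percent of draftees educated beyond primary school.
scenarios <- data.frame(Education = c(5, 15))

# Point estimates of expected fertility under each scenario.
expected <- predict(fit, newdata = scenarios)

# First difference: expected fertility at 15 minus expected fertility at 5.
first_difference <- unname(expected[2] - expected[1])

print(expected)
print(first_difference)
```

Because the model is linear, the first difference is simply ten times the Education coefficient, or about -8.6 fertility units.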

After setting our predictor values, we simulate using the sim() method:

# run simulations and estimate quantities of interest
z5$sim()
# summary of the simulated quantities of interest
summary(z5)

At this point, we’ve estimated a model, set the predictor values, and simulated easily interpretable quantities of interest. The summary() method reports these quantities: the expected and predicted values at each level of education, and the first difference, that is, the difference in expected values between the two set levels of education.
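To give a sense of what simulation-based quantities of interest involve, here is a simplified sketch in the spirit of King, Tomz, and Wittenberg (2000), written with base R and the MASS package that ships with R. It is not Zelig's implementation: it assumes the coefficients can be drawn from a multivariate normal distribution centered on the lm() estimates, and it covers only expected values and the first difference. The names n_sims, x_low, and x_high are illustrative.

```r
library(MASS)  # provides mvrnorm() for multivariate normal draws

set.seed(1)    # make the simulation reproducible

fit <- lm(Fertility ~ Education, data = swiss)

# Draw simulated coefficient vectors from their approximate sampling distribution.
n_sims <- 1000
beta_draws <- mvrnorm(n_sims, mu = coef(fit), Sigma = vcov(fit))

# Covariate profiles (intercept, Education) for the two scenarios.
x_low  <- c(1, 5)
x_high <- c(1, 15)

# Expected fertility under each scenario, one value per simulated draw.
ev_low  <- as.vector(beta_draws %*% x_low)
ev_high <- as.vector(beta_draws %*% x_high)

# First difference for each draw, summarized by its median and a 95% interval.
fd <- ev_high - ev_low
print(quantile(fd, c(0.025, 0.5, 0.975)))
```

Summarizing the draws with quantiles gives an uncertainty interval directly, without deriving a standard error for the difference by hand.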


Visualizations

Zelig’s graph() method plots the estimated quantities of interest:

z5$graph()

Help

Finally, model documentation can be accessed using the z5$help() method at any point after a model object has been initialized:

# documentation for least squares model
z5$help()
# documentation for logistic regression
z5 <- zlogit$new()
z5$help()