Post-stratification and calibration

Post-stratification, raking, and calibration (or GREG estimation) are related ways of using auxiliary information available on the whole population. These methods all involve adjusting the sampling weights so that the known population totals for auxiliary variables are reproduced exactly.

Post-stratification is done with the postStratify function, for survey designs with or without replicate weights. This example of post-stratification is based on one at UCLA Academic Technology Services, analysing data from Sampling of Populations by Levy and Lemeshow.

First a survey design object is created:
 > dpets <- svydesign(id = ~1, weight = ~weight, data = pets, fpc = ~n)
Post-stratification requires data on the population distribution of the stratifying variable. This can be supplied as a data frame or a table; here a data frame is used. The other arguments to postStratify are the survey design object and a model formula specifying the stratifying variables.
 > pspets <- postStratify(dpets, ~type, data.frame(type = c(1, 2), 
      Freq = c(850, 450)))

The post-stratified survey design object can be used in the same way as any other survey design object. For example, to show the precision gained by post-stratification:
 > svytotal(~totexp,dpets)
        total     SE
 totexp 51643 2888.3
 > svytotal(~totexp, pspets)
        total     SE
 totexp 52150 1512.5
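As a quick check, the post-stratified weights should reproduce the supplied population counts exactly. Since type is numeric in the pets data, a sketch of such a check wraps it in factor() so that svytotal reports one total per category:
 svytotal(~factor(type), pspets)
This should return totals of 850 and 450 with essentially zero standard errors.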

Raking is a way to approximate post-stratification on a set of variables when only their marginal population distributions are known. Here we work with a one-stage cluster sample from the California Academic Performance Index, using the replicate-weights design object rclus1 created in an earlier example.

First define data frames giving the population totals for schools by type (elementary, middle, high) and by whether they met their school-wide growth target.
 pop.types <- data.frame(stype=c("E","H","M"), Freq=c(4421,755,1018))
 pop.schwide <- data.frame(sch.wide=c("No","Yes"), Freq=c(1072,5122))
Now apply these to rclus1. The first argument to rake is the survey design object. The second is a list of formulas specifying the variables to rake. The third argument is a list of data frames or tables giving the population distributions of each variable.
 rclus1r <- rake(rclus1, list(~stype,~sch.wide), list(pop.types, pop.schwide))
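As with post-stratification, the raked weights should reproduce each set of marginal totals. A quick check along these lines is
 svytotal(~stype, rclus1r)
 svytotal(~sch.wide, rclus1r)
which should give 4421, 755, 1018 and 1072, 5122 (up to the convergence tolerance of rake) with essentially zero standard errors.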

Raking can also be used when the joint distribution of some variables is known. Suppose we know the joint population distribution of school type and the "comparative improvement" target:
 pop.imptype <- with(apipop, table(comp.imp, stype))
We can then call rake on dclus1 (the design-based object created in an earlier example), specifying two margins: sch.wide, and the combination of comp.imp and stype.
 dclus1r <- rake(dclus1, list(~comp.imp+stype, ~sch.wide), list(pop.imptype, pop.schwide))

Calibration, or GREG estimation, allows continuous as well as discrete auxiliary variables. The method is motivated by regression estimation of a total but can be computed simply by reweighting, using the calibrate() function. The method is described for estimation of a total in Särndal et al., Model Assisted Survey Sampling; we use the same approach to estimate the total of estimating functions.
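For reference, the GREG estimator of a population total takes the regression-estimator form given by Särndal et al.,

\[ \hat T_{y,\mathrm{GREG}} \;=\; \hat T_{y,\pi} + \big(T_x - \hat T_{x,\pi}\big)^{\top}\hat\beta \;=\; \sum_{i\in s} g_i w_i y_i, \]

where \(\hat T_{y,\pi}\) and \(\hat T_{x,\pi}\) are Horvitz-Thompson estimators, \(T_x\) is the vector of known auxiliary totals, \(\hat\beta\) is a design-weighted regression coefficient, and \(g_i w_i\) are the calibrated weights that calibrate() constructs.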

This example uses the same dclus1 survey design as above. First we calibrate to the school type variable. We supply a formula ~stype and must specify population totals for the design matrix produced by this formula.

 pop.totals<-c(`(Intercept)`=6194, stypeH=755, stypeM=1018)
 dclus1g<-calibrate(dclus1, ~stype, pop.totals)
We could also have given the formula as ~stype-1, with no intercept, in which case the population totals would have been simply the three stratum sizes.
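A sketch of that alternative call, with the three stratum sizes taken from pop.types above and a hypothetical object name, is
 dclus1g.alt <- calibrate(dclus1, ~stype-1,
      c(stypeE=4421, stypeH=755, stypeM=1018))
The constraints are the same as before (the intercept total 6194 is just the sum of the three stratum sizes), so the calibrated weights are identical to those of dclus1g.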

Calibration to a single factor variable is equivalent to post-stratification, as we see from these examples:

> svymean(~api00, dclus1g)
        mean     SE
api00 642.31 23.921
> svytotal(~enroll, dclus1g)
         total     SE
enroll 3680893 406293
> svytotal(~stype, dclus1g)
       total        SE
stypeE  4421 2.308e-14
stypeH   755 5.985e-15
stypeM  1018 5.616e-14
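One way to see the equivalence directly (a sketch, using pop.types defined in the raking example and an arbitrary object name) is to post-stratify dclus1 on the same variable and compare:
 dclus1ps <- postStratify(dclus1, ~stype, pop.types)
 svymean(~api00, dclus1ps)   ## should reproduce the calibrated mean and SE above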
We can calibrate using the two variables stype and sch.wide, as we did with raking above.
> dclus1g2 <- calibrate(dclus1, ~stype+sch.wide, c(pop.totals, sch.wideYes=5122))
> svymean(~api00, dclus1g2)
      mean     SE
api00  641 23.829
> svytotal(~enroll, dclus1g2)
         total     SE
enroll 3654414 403074
> svytotal(~stype, dclus1g2)
       total        SE
stypeE  4421 4.704e-14
stypeH   755 2.921e-15
stypeM  1018 3.821e-14
The results of calibration and raking are similar, but not identical.

Finally, we can look at a calibration example with a continuous variable. Imagine that the 1999 API is known for all schools but the 2000 API is known only for the sample. Since there is a high correlation between 1999 and 2000 API (see the graphics example for a scatterplot), calibrating to 1999 API reduces the uncertainty in mean 2000 API considerably.
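The calibrated design object dclus1g3 used below is created by calibrating on school type and the 1999 API; a sketch of the call, using the same population totals that appear in the bounded examples further down, is
 dclus1g3 <- calibrate(dclus1, ~stype+api99, c(pop.totals, api99=3914069))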

> svymean(~api00, dclus1g3)
        mean     SE
api00 665.31 3.4418
> svytotal(~enroll, dclus1g3)
         total     SE
enroll 3638487 385524
> svytotal(~stype, dclus1g3)
       total        SE
stypeE  4421 4.376e-14
stypeH   755 2.369e-14
stypeM  1018 2.140e-14
The reduction in uncertainty is less impressive for a subpopulation, as knowing the population total does not give complete information about the subpopulation total:
>  svymean(~api00, subset(dclus1g3, comp.imp=="Yes"))
        mean     SE
api00 672.05 6.5182

These examples have used a calibration working model with constant variance. Using a working model with variance proportional to the covariate, it is possible to express the standard ratio estimator of a total as a calibration estimator:

> dstrat <- svydesign(id = ~1, strata = ~stype, weights = ~pw, 
+     data = apistrat, fpc = ~fpc)
> ratio <- svyratio(~api.stu, ~enroll, dstrat, separate = FALSE)
> predict(ratio, total=3811472)
$total
         enroll
api.stu 3190038

$se
          enroll
api.stu 29565.98
> dstratg <- calibrate(dstrat, ~enroll-1, pop=3811472, lambda=1)
> svytotal(~api.stu, dstratg)
          total    SE
api.stu 3190038 29566
The parameter lambda specifies a linear combination of the predictors to which the working-model variance is proportional. With a single predictor, any non-NULL value is equivalent.
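Because calibration reproduces the auxiliary totals exactly, the calibrated design should return the known enrollment total with a standard error of essentially zero; a quick check along these lines is
 svytotal(~enroll, dstratg)   ## expected: total 3811472, SE numerically zero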

Two refinements of regression calibration are the ability to impose bounds on the calibration weights and the ability to require that the weights be constant within sampling units.

In the example above, the changes in weights needed for calibration were substantial:

> range(weights(dclus1g3)/weights(dclus1))
[1] 0.4185925 1.8332949
The original sampling weights were decreased by nearly 60% or increased by over 80%. We can attempt a bounded regression calibration, restricting the calibration adjustments (the ratio of calibrated weight to sampling weight) to be positive and less than 1.5:
> (dclus1g3b <- calibrate(dclus1, ~stype+api99,
+     c(pop.totals, api99=3914069), bounds=c(0,1.5)))
1 - level Cluster Sampling design
With (15) clusters.
calibrate(dclus1, ~stype + api99, c(pop.totals, api99 = 3914069), 
    bounds = c(0, 1.5))
>  range(weights(dclus1g3b)/weights(dclus1))
[1] 0.4970466 1.5000000
The bounds cannot be specified arbitrarily. Attempting a lower bound of 0.5 instead of 0 fails.
> (dclus1g3b <- calibrate(dclus1, ~stype+api99,
+     c(pop.totals, api99=3914069), bounds=c(0.5,1.5)))
Error in calibrate.survey.design2(dclus1, ~stype + api99, c(pop.totals,  : 
        Failed to achieve bounds in 11 iterations

Another useful restriction is to make the weights constant within sampling units. For example, in a survey that samples households and collects data on all individuals in a household, the true probability of being sampled is identical for all individuals in the household, and it is desirable for the sampling weights to be identical as well.

In the one-stage cluster sample we have been using, all schools in a random sample of school districts are used, so the weights should be equal within school districts. Specifying aggregate.stage=1 requests that the weights be constant within sampling units at the first stage of sampling.

> (dclus1agg<-calibrate(dclus1, ~stype, pop.totals, aggregate.stage=1))
1 - level Cluster Sampling design
With (15) clusters.
calibrate(dclus1, ~stype, pop.totals, aggregate.stage = 1)
The extra restriction on the weights tends to mean that larger adjustments to the weights are needed to achieve calibration:
>  range(weights(dclus1agg)/weights(dclus1))
[1] 0.3234162 2.2828519
>  range(weights(dclus1g)/weights(dclus1))
[1] 0.907064 1.593304
For designs specified by replicate weights, the sampling units may be unknown. Rather than specifying aggregate.stage, supply a formula or vector of unit identifiers as the aggregate.index argument.
> (rclus1agg3 <- calibrate(rclus1, ~stype+api99, c(pop.totals, api99=3914069),
+     aggregate.index=~dnum))
Call: calibrate(rclus1, ~stype + api99, c(pop.totals, api99 = 3914069), 
    aggregate.index = ~dnum)
Unstratified cluster jacknife (JK1) with 15 replicates.
Generalized raking

calibrate has an argument calfun to specify the calibration function: linear, raking, or logit. A number of European agencies prefer logit calibration because it tends to produce fewer extreme weights.

For example, we can perform logit calibration of the one-stage cluster sample:

> dclus1g3d <- calibrate(dclus1, ~stype + api99, c(pop.totals, 
+     api99 = 3914069), calfun = "logit", bounds = c(0.5, 2.5))
> summary(weights(dclus1g3d)/weights(dclus1))
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
 0.5944  0.7487  0.9473  1.0000  1.1740  1.9360 
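Raking calibration can be requested in the same way (a sketch, with an arbitrary object name); the raking adjustment is multiplicative, so the weight adjustments stay strictly positive without explicit bounds:
 dclus1g3c <- calibrate(dclus1, ~stype + api99, c(pop.totals, api99 = 3914069),
      calfun = "raking")
 summary(weights(dclus1g3c)/weights(dclus1))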

Thomas Lumley
Last modified: Fri Jul 15 20:17:30 PDT 2005