Raking uses iterative post-stratification to match marginal distributions of a survey sample to known population margins.

rake(design, sample.margins, population.margins, control = list(maxit =
10, epsilon = 1, verbose=FALSE), compress=NULL)

Arguments

design

A survey object

sample.margins

list of formulas or data frames describing sample margins, which must not contain missing values

population.margins

list of tables or data frames describing corresponding population margins

control

maxit controls the number of iterations. Convergence is declared if the maximum change in a table entry is less than epsilon. If epsilon<1 it is taken to be a fraction of the total sampling weight.

compress

If design has replicate weights, attempt to compress the new replicate weight matrix? When NULL, will attempt to compress if the original weight matrix was compressed

Details

The sample.margins should be in a format suitable for postStratify.

Raking (aka iterative proportional fitting) is known to converge for any table without zeros, and for any table with zeros for which there is a joint distribution with the given margins and the same pattern of zeros. The `margins' need not be one-dimensional.

The algorithm works by repeated calls to postStratify (iterative proportional fitting), which is efficient for large multiway tables. For small tables calibrate will be faster, and also allows raking to population totals for continuous variables, and raking with bounded weights.

Value

A raked survey design.

See also

postStratify, compressWeights

calibrate for other ways to use auxiliary information.

Examples

data(api)
dclus1 <- svydesign(id=~dnum, weights=~pw, data=apiclus1, fpc=~fpc)
rclus1 <- as.svrepdesign(dclus1)

svymean(~api00, rclus1)
#>         mean     SE
#> api00 644.17 26.329
svytotal(~enroll, rclus1)
#>          total     SE
#> enroll 3404940 932235

## population marginal totals for each stratum
pop.types <- data.frame(stype=c("E","H","M"), Freq=c(4421,755,1018))
pop.schwide <- data.frame(sch.wide=c("No","Yes"), Freq=c(1072,5122))

rclus1r <- rake(rclus1, list(~stype,~sch.wide), list(pop.types, pop.schwide))

svymean(~api00, rclus1r)
#>         mean     SE
#> api00 641.23 26.873
svytotal(~enroll, rclus1r)
#>          total     SE
#> enroll 3647300 463511

## marginal totals correspond to population
xtabs(~stype, apipop)
#> stype
#>    E    H    M 
#> 4421  755 1018 
svytable(~stype, rclus1r, round=TRUE)
#> stype
#>    E    H    M 
#> 4421  755 1018 
xtabs(~sch.wide, apipop)
#> sch.wide
#>   No  Yes 
#> 1072 5122 
svytable(~sch.wide, rclus1r, round=TRUE)
#> sch.wide
#>   No  Yes 
#> 1072 5122 

## joint totals don't correspond 
xtabs(~stype+sch.wide, apipop)
#>      sch.wide
#> stype   No  Yes
#>     E  472 3949
#>     H  334  421
#>     M  266  752
svytable(~stype+sch.wide, rclus1r, round=TRUE)
#>      sch.wide
#> stype   No  Yes
#>     E  478 3943
#>     H  201  554
#>     M  393  625

## Do it for a design without replicate weights
dclus1r<-rake(dclus1, list(~stype,~sch.wide), list(pop.types, pop.schwide))

svymean(~api00, dclus1r)
#>         mean     SE
#> api00 641.23 23.704
svytotal(~enroll, dclus1r)
#>          total     SE
#> enroll 3647300 400603

## compare to raking with calibrate()
dclus1gr<-calibrate(dclus1, ~stype+sch.wide, pop=c(6194, 755,1018,5122),
           calfun="raking")
svymean(~stype+api00, dclus1r)
#>             mean     SE
#> stypeE   0.71375  0.000
#> stypeH   0.12189  0.000
#> stypeM   0.16436  0.000
#> api00  641.23025 23.704
svymean(~stype+api00, dclus1gr)
#>             mean     SE
#> stypeE   0.71376  0.000
#> stypeH   0.12189  0.000
#> stypeM   0.16435  0.000
#> api00  641.23032 23.704

## compare to joint post-stratification
## (only possible if joint population table is known)
##
pop.table <- xtabs(~stype+sch.wide,apipop)
rclus1ps <- postStratify(rclus1, ~stype+sch.wide, pop.table)
svytable(~stype+sch.wide, rclus1ps, round=TRUE)
#>      sch.wide
#> stype   No  Yes
#>     E  472 3949
#>     H  334  421
#>     M  266  752

svymean(~api00, rclus1ps)
#>         mean     SE
#> api00 643.15 27.102
svytotal(~enroll, rclus1ps)
#>          total     SE
#> enroll 3610219 468086

## Example of raking with partial joint distributions
pop.imp<-data.frame(comp.imp=c("No","Yes"),Freq=c(1712,4482))
dclus1r2<-rake(dclus1, list(~stype+sch.wide, ~comp.imp),
               list(pop.table, pop.imp))
svymean(~api00, dclus1r2)
#>         mean     SE
#> api00 642.62 22.732

## compare to calibrate() syntax with tables
dclus1r2<-calibrate(dclus1, formula=list(~stype+sch.wide, ~comp.imp),
               population=list(pop.table, pop.imp),calfun="raking")
svymean(~api00, dclus1r2)
#>         mean     SE
#> api00 642.61 22.731