svyby {survey}

R Documentation

Survey statistics on subsets

Description

Compute survey statistics on subsets of a survey defined by factors.

Usage

svyby(formula, by, design, ...)
## Default S3 method:
svyby(formula, by, design, FUN, ..., deff=FALSE, keep.var=TRUE,
      keep.names=TRUE, verbose=FALSE, vartype=c("se","ci","cv","cvpct","var"),
      drop.empty.groups=TRUE, covmat=FALSE, return.replicates=FALSE,
      multicore=getOption("survey.multicore"))
## S3 method for class 'svyby':
SE(object,...)
## S3 method for class 'svyby':
deff(object,...)
## S3 method for class 'svyby':
coef(object,...)
unwtd.count(x, design, ...)

Arguments

`formula,x`	A formula specifying the variables to pass to `FUN` (or a matrix, data frame, or vector)
`by`	A formula specifying factors that define subsets, or a list of factors.
`design`	A `svydesign` or `svrepdesign` object
`FUN`	A function taking a formula and survey design object as its first two arguments.
`...`	Other arguments to `FUN`
`deff`	Request a design effect from `FUN`
`keep.var`	If `FUN` returns a `svystat` object, extract standard errors from it
`keep.names`	Define row names based on the subsets
`verbose`	If `TRUE`, print a label for each subset as it is processed.
`vartype`	Report variability as one or more of standard error, confidence interval, coefficient of variation, percent coefficient of variation, or variance
`drop.empty.groups`	If `FALSE`, report `NA` for empty groups; if `TRUE`, drop them from the output
`covmat`	If `TRUE`, compute covariances between estimates for different subsets (currently only for replicate-weight designs). Allows `svycontrast` to be used on output.
`return.replicates`	Only for replicate-weight designs. If `TRUE`, return all the replicates as an attribute of the result
`multicore`	If `TRUE`, use the `multicore` package to distribute subsets over multiple processors
`object`	An object of class `"svyby"`

Details

The variance type "ci" asks for confidence intervals, which are produced by confint. In some cases FUN needs additional options before it can produce confidence intervals; for example, svyquantile needs ci=TRUE.
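As a minimal sketch of the "ci" variance type, the code below requests intervals directly through vartype and, separately, calls confint on a svyby result. It reuses the api data and the dclus1 design from the Examples section. The object names by_ci and ci are illustrative only.

```r
library(survey)
data(api)

# One-stage cluster design, as in the Examples section
dclus1 <- svydesign(id = ~dnum, weights = ~pw, data = apiclus1, fpc = ~fpc)

# Ask svyby to report confidence limits instead of standard errors
by_ci <- svyby(~api99, ~stype, dclus1, svymean, vartype = "ci")
print(by_ci)

# Alternatively, compute intervals afterwards from a standard svyby result
ci <- confint(svyby(~api99, ~stype, dclus1, svymean))
print(ci)
```

Both routes give one interval per subset. Calling confint afterwards also allows a different level to be chosen without recomputing the estimates.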

unwtd.count is designed to be passed to svyby to report the number of non-missing observations in each subset. Observations with exactly zero weight will also be counted as missing, since that's how subsets are implemented for some designs.
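The following sketch illustrates how unwtd.count treats missing values. The missing entries are introduced artificially for illustration (they are not part of the api data), and the names dat, d_na, and counts are hypothetical.

```r
library(survey)
data(api)

# Copy the cluster sample and blank out a few values (illustrative only)
dat <- apiclus1
dat$api99[1:5] <- NA

d_na <- svydesign(id = ~dnum, weights = ~pw, data = dat, fpc = ~fpc)

# Number of non-missing api99 observations within each school type
counts <- svyby(~api99, ~stype, d_na, unwtd.count, keep.var = FALSE)
print(counts)

# Cross-check against a direct tabulation of the non-missing rows
print(table(dat$stype[!is.na(dat$api99)]))
```

The per-subset counts should agree with the direct tabulation, since no observation in this design has zero weight.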

Parallel processing with multicore=TRUE is useful only for fairly large problems and on computers with sufficient memory. The multicore package is incompatible with some GUIs, although the Mac Aqua GUI appears to be safe.

Value

An object of class "svyby": a data frame showing the factors and the results of FUN.
For unwtd.count, the unweighted number of non-missing observations in the data matrix specified by x for the design.
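
A short sketch of inspecting the returned object, assuming the api data and the dclus1 design from the Examples section (the name res is illustrative):

```r
library(survey)
data(api)

dclus1 <- svydesign(id = ~dnum, weights = ~pw, data = apiclus1, fpc = ~fpc)

res <- svyby(~api99 + api00, ~stype, dclus1, svymean)

# The result is a data frame with an extra "svyby" class
print(class(res))

# One row per subset: the by-factors first, then estimates and variability
print(names(res))
print(nrow(res))

# The extractor functions work on the whole object
print(coef(res))
print(SE(res))
```

Because the result is a data frame, it can also be subset or merged with the usual data frame tools.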

Note

Asking for a design effect (deff=TRUE) from a function that does not produce one will cause an error or incorrect formatting of the output. The same will occur with keep.var=TRUE if the function does not compute a standard error.
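
To illustrate the point about keep.var, the sketch below passes svyby a user-written function that returns only a point estimate. The function wtd_mean_only is hypothetical and not part of the survey package; it assumes the subset design exposes its data in design$variables, as ordinary svydesign objects do. Because it computes no standard error, it is called with keep.var=FALSE.

```r
library(survey)
data(api)

dclus1 <- svydesign(id = ~dnum, weights = ~pw, data = apiclus1, fpc = ~fpc)

# Hypothetical FUN: a weighted mean with no variance estimate.
# The ... absorbs extra arguments that svyby passes along, such as deff.
wtd_mean_only <- function(formula, design, ...) {
  x <- model.frame(formula, design$variables, na.action = na.pass)[[1]]
  w <- weights(design, "sampling")
  sum(w * x) / sum(w)
}

# keep.var = FALSE: do not try to extract standard errors from the result
res <- svyby(~api99, ~stype, dclus1, wtd_mean_only, keep.var = FALSE)
print(res)

# The point estimates should match those from svymean
print(coef(svyby(~api99, ~stype, dclus1, svymean)))
```

Leaving keep.var at its default with a function like this is the situation the note warns about.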

Examples

data(api)
dclus1<-svydesign(id=~dnum, weights=~pw, data=apiclus1, fpc=~fpc)

svyby(~api99, ~stype, dclus1, svymean)
svyby(~api99, ~stype, dclus1, svyquantile, quantiles=0.5,ci=TRUE,vartype="ci")
## without ci=TRUE svyquantile does not compute standard errors
svyby(~api99, ~stype, dclus1, svyquantile, quantiles=0.5, keep.var=FALSE)
svyby(~api99, list(school.type=apiclus1$stype), dclus1, svymean)
svyby(~api99+api00, ~stype, dclus1, svymean, deff=TRUE,vartype="ci")
svyby(~api99+api00, ~stype+sch.wide, dclus1, svymean, keep.var=FALSE)
## report raw number of observations
svyby(~api99+api00, ~stype+sch.wide, dclus1, unwtd.count, keep.var=FALSE)

rclus1<-as.svrepdesign(dclus1)

svyby(~api99, ~stype, rclus1, svymean)
svyby(~api99, ~stype, rclus1, svyquantile, quantiles=0.5)
svyby(~api99, list(school.type=apiclus1$stype), rclus1, svymean, vartype="cv")
svyby(~enroll,~stype, rclus1,svytotal, deff=TRUE)
svyby(~api99+api00, ~stype+sch.wide, rclus1, svymean, keep.var=FALSE)
## report raw number of observations
svyby(~api99+api00, ~stype+sch.wide, rclus1, unwtd.count, keep.var=FALSE)

## comparing subgroups using covmat=TRUE
mns<-svyby(~api99, ~stype, rclus1, svymean,covmat=TRUE)
vcov(mns)
svycontrast(mns, c(E = 1, M = -1))

str(svyby(~api99, ~stype, rclus1, svymean,return.replicates=TRUE))

## extractor functions
(a<-svyby(~enroll, ~stype, rclus1, svytotal, deff=TRUE, verbose=TRUE, vartype=c("se","cv","cvpct","var")))
deff(a)
SE(a)
cv(a)
coef(a)

## ratio estimates
svyby(~api.stu, by=~stype, denominator=~enroll, design=dclus1, svyratio)

## empty groups
svyby(~api00,~comp.imp+sch.wide,design=dclus1,svymean)
svyby(~api00,~comp.imp+sch.wide,design=dclus1,svymean,drop.empty.groups=FALSE)

[Package survey version 3.18 Index]