svyglm {survey} | R Documentation |
Fit a generalised linear model to data from a complex survey design, with inverse-probability weighting and design-based standard errors.
## S3 method for class 'survey.design': svyglm(formula, design, subset=NULL, ...) ## S3 method for class 'svyrep.design': svyglm(formula, design, subset=NULL, ..., rho=NULL, return.replicates=FALSE, na.action,multicore=getOption("survey.multicore")) ## S3 method for class 'svyglm': summary(object, correlation = FALSE, df.resid=NULL, ...) ## S3 method for class 'svyglm': predict(object,newdata=NULL,total=NULL, type=c("link","response","terms"), se.fit=(type != "terms"),vcov=FALSE,...)
formula |
Model formula |
design |
Survey design from svydesign or svrepdesign . Must contain all variables
in the formula |
subset |
Expression to select a subpopulation |
... |
Other arguments passed to glm or
summary.glm |
rho |
For replicate BRR designs, to specify the parameter for
Fay's variance method, giving weights of rho and 2-rho |
return.replicates |
Return the replicates as a component of the result? |
object |
A svyglm object |
correlation |
Include the correlation matrix of parameters? |
na.action |
Handling of NAs |
multicore |
Use the multicore package to distribute
replicates across processors? |
df.resid |
Optional denominator degrees of freedom for Wald tests |
newdata |
new data frame for prediction |
total |
population size when predicting population total |
type |
linear predictor (link ) or response |
se.fit |
if TRUE , return variances of predictions |
vcov |
if TRUE and se=TRUE return full
variance-covariance matrix of predictions |
There is no anova
method for svyglm
as the models are not
fitted by maximum likelihood. The function regTermTest
may
be useful for testing sets of regression terms.
For binomial and Poisson families use family=quasibinomial()
and family=quasipoisson()
to avoid a warning about non-integer
numbers of successes. The `quasi' versions of the family objects give
the same point estimates and standard errors and do not give the
warning.
If df.resid
is not specified the df for the null model is
computed by degf
and the residual df computed by
subtraction. It's not that these are particularly good approximations
in a regression model but they are relatively standard. To get tests
based on a Normal distribution use df.resid=Inf
.
Parallel processing with multicore=TRUE
is helpful only for
fairly large data sets and on computers with sufficient memory. It may
be incompatible with GUIs, although the Mac Aqua GUI appears to be safe.
predict
gives fitted values and sampling variability for specific new
values of covariates. When newdata
are the population mean it
gives the regression estimator of the mean, and when newdata
are
the population totals and total
is specified it gives the
regression estimator of the population total. Regression estimators of
mean and total can also be obtained with calibrate
.
svyglm
returns an object of class svyglm
. The
predict
method returns an object of class svystat
Thomas Lumley
glm
, which is used to do most of the work.
regTermTest
, for multiparameter tests
calibrate
, for an alternative way to specify regression
estimators of population totals or means
svyttest
for one-sample and two-sample t-tests.
data(api) dstrat<-svydesign(id=~1,strata=~stype, weights=~pw, data=apistrat, fpc=~fpc) dclus2<-svydesign(id=~dnum+snum, weights=~pw, data=apiclus2) rstrat<-as.svrepdesign(dstrat) rclus2<-as.svrepdesign(dclus2) summary(svyglm(api00~ell+meals+mobility, design=dstrat)) summary(svyglm(api00~ell+meals+mobility, design=dclus2)) summary(svyglm(api00~ell+meals+mobility, design=rstrat)) summary(svyglm(api00~ell+meals+mobility, design=rclus2)) ## use quasibinomial, quasipoisson to avoid warning messages summary(svyglm(sch.wide~ell+meals+mobility, design=dstrat, family=quasibinomial())) ## Compare regression and ratio estimation of totals api.ratio <- svyratio(~api.stu,~enroll, design=dstrat) pop<-data.frame(enroll=sum(apipop$enroll, na.rm=TRUE)) npop <- nrow(apipop) predict(api.ratio, pop$enroll) ## regression estimator is less efficient api.reg <- svyglm(api.stu~enroll, design=dstrat) predict(api.reg, newdata=pop, total=npop) ## same as calibration estimator svytotal(~api.stu, calibrate(dstrat, ~enroll, pop=c(npop, pop$enroll))) ## svyglm can also reproduce the ratio estimator api.reg2 <- svyglm(api.stu~enroll-1, design=dstrat, family=quasi(link="identity",var="mu")) predict(api.reg2, newdata=pop, total=npop) ## higher efficiency by modelling variance better api.reg3 <- svyglm(api.stu~enroll-1, design=dstrat, family=quasi(link="identity",var="mu^3")) predict(api.reg3, newdata=pop, total=npop) ## true value sum(apipop$api.stu)