Fit and compare hierarchical loglinear models for complex survey data.

svyloglin(formula, design, ...)
# S3 method for svyloglin
update(object,formula,...)
# S3 method for svyloglin
anova(object,object1,...,integrate=FALSE)
# S3 method for anova.svyloglin
print(x,pval=c("F","saddlepoint","lincom","chisq"),...)
# S3 method for svyloglin
coef(object,...,intercept=FALSE)

Arguments

formula

Model formula

design

survey design object

object,object1

loglinear model from svyloglin

pval

p-value approximation: see Details

integrate

Compute the exact asymptotic p-value (slow)?

...

not used

intercept

Report the intercept?

x

anova object

Details

The loglinear model is fitted to a multiway table with probabilities estimated by svymean and with the sample size equal to the observed sample size, treating the resulting table as if it came from iid multinomial sampling, as described by Rao and Scott. The variance-covariance matrix does not include the intercept term, and so by default neither does the coef method. A Newton-Raphson algorithm is used, rather than iterative proportional fitting, so starting values are not needed.

The anova method computes the quantities that would be the score (Pearson) and likelihood ratio chi-squared statistics if the data were an iid sample. It computes four p-values for each of these, based on the exact asymptotic distribution (see pchisqsum), a saddlepoint approximateion to this distribution, a scaled chi-squared distribution, and a scaled F-distribution. When testing the two-way interaction model against the main-effects model in a two-way table the score statistic and p-values match the Rao-Scott tests computed by svychisq.

The anova method can only compare two models if they are for exactly the same multiway table (same variables and same order). The update method will help with this. It is also much faster to use update than svyloglin for a large data set: its time complexity depends only on the size of the model, not on the size of the data set.

It is not possible to fit a model using a variable created inline, eg I(x<10), since the multiway table is based on all variables used in the formula.

Value

Object of class "svyloglin"

References

Rao, JNK, Scott, AJ (1984) "On Chi-squared Tests For Multiway Contingency Tables with Proportions Estimated From Survey Data" Annals of Statistics 12:46-60.

Examples

 data(api)
 dclus1<-svydesign(id=~dnum, weights=~pw, data=apiclus1, fpc=~fpc)
 a<-svyloglin(~stype+comp.imp,dclus1)
 b<-update(a,~.^2)
 an<-anova(a,b)
 an
#> Analysis of Deviance Table
#>  Model 1: y ~ stype + comp.imp
#> Model 2: y ~ stype + comp.imp + stype:comp.imp 
#> Deviance= 8.376767 p= 0.0676402 
#> Score= 9.013874 p= 0.05738682 
 print(an, pval="saddlepoint")
#> Analysis of Deviance Table
#>  Model 1: y ~ stype + comp.imp
#> Model 2: y ~ stype + comp.imp + stype:comp.imp 
#> Deviance= 8.376767 p= 0.05376131 
#> Score= 9.013874 p= 0.05547346 

 ## Wald test
 regTermTest(b, ~stype:comp.imp)
#> Wald test for stype:comp.imp
#>  in update(a, ~.^2)
#> F =  6.382223  on  2  and  13  df: p= 0.011722 

 ## linear-by-linear association
 d<-update(a,~.+as.numeric(stype):as.numeric(comp.imp))
 an1<-anova(a,d)
 an1
#> Analysis of Deviance Table
#>  Model 1: y ~ stype + comp.imp
#> Model 2: y ~ stype + comp.imp + as.numeric(stype):as.numeric(comp.imp) 
#> Deviance= 6.739685 p= 0.03853491 
#> Score= 7.293848 p= 0.03089338