xdesign.Rd
Defines a design object with multiple dimensions of correlation: observations that share any of the id variables are correlated, or you can supply an adjacency matrix or Matrix to specify which are correlated. Supports crossed designs (e.g. multiple raters of multiple objects) and non-nested observational correlation (e.g. observations sharing primary school or secondary school). Has methods for svymean, svytotal, svyglm (so far).
xdesign(id = NULL, strata = NULL, weights = NULL, data, fpc = NULL,
adjacency = NULL, overlap = c("unbiased", "positive"), allow.non.binary = FALSE)
id: list of formulas specifying cluster identifiers for each clustering dimension (or NULL)
strata: Not implemented
weights: model formula specifying (sampling) weights
data: data frame containing all the variables
fpc: Not implemented
adjacency: Adjacency matrix or Matrix indicating which pairs of observations are correlated
overlap: See details below
allow.non.binary: If FALSE, check that adjacency is a binary 0/1 or TRUE/FALSE matrix or Matrix.
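As a rough illustration of what a binary adjacency matrix looks like, the sketch below builds one in base R for a small crossed design. The toy vectors rater and object, and the name adj, are hypothetical and not part of the package. The call to xdesign itself is deliberately left out, so this only shows the kind of object the adjacency argument describes.

```r
# Hypothetical toy data: six ratings, each with a rater id and an object id.
rater  <- c(1, 1, 2, 2, 3, 3)
object <- c(1, 2, 1, 3, 2, 3)

# TRUE where two observations share a rater, and where they share an object.
same_rater  <- outer(rater, rater, "==")
same_object <- outer(object, object, "==")

# Two observations are treated as correlated if they share either identifier.
adj <- same_rater | same_object

# Sanity checks: logical (binary), symmetric, and TRUE on the diagonal.
stopifnot(is.logical(adj), identical(adj, t(adj)), all(diag(adj)))

# Observations 1 and 6 share neither rater nor object, so they are not linked.
adj[1, 6]
```

Using a logical OR rather than a sum keeps every entry TRUE/FALSE, which is the binary form that the allow.non.binary = FALSE check expects.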
Subsetting for these objects actually drops observations; it is not equivalent to just setting weights to zero as for survey designs. So, for example, a subset of a balanced design will not be a balanced design.
The overlap option controls double-counting of some variance terms. Suppose there are two clustering dimensions, ~a and ~b. If we compute variance matrices clustered on a and clustered on b and add them, observations that share both a and b will be counted twice, giving a positively biased estimator. We can subtract off a variance matrix clustered on combinations of a and b to give an unbiased variance estimator. However, the unbiased estimator is not guaranteed to be positive definite. In the references, Miglioretti and Heagerty use the overlap="positive" estimator and Cameron et al. use the overlap="unbiased" estimator.
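The two options described above can be written compactly. The notation here is an assumption made for illustration and does not come from the package: \hat V_a and \hat V_b are the variance matrices clustered on a and on b, and \hat V_{a \cap b} is the variance matrix clustered on combinations of a and b.

```latex
\[
\hat V_{\text{positive}} = \hat V_a + \hat V_b ,
\qquad
\hat V_{\text{unbiased}} = \hat V_a + \hat V_b - \hat V_{a \cap b} .
\]
```

Read pair by pair, two observations that share both a and b contribute twice to the first sum and once to the second, which is the double-counting that the subtraction removes.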
An object of class xdesign
Miglioretti DL, Heagerty PJ (2007). Marginal modeling of nonnested multilevel data using standard software. American Journal of Epidemiology 165(4):453-463.
Cameron AC, Gelbach JB, Miller DL (2011). Robust inference with multiway clustering. Journal of Business & Economic Statistics 29(2):238-249.
https://notstatschat.rbind.io/2021/09/18/crossed-clustering-and-parallel-invention/
library(survey)
## With one clustering dimension, this is close to the with-replacement
## survey estimator, but not identical unless clusters are of equal size
data(api)
dclus1r<-svydesign(id=~dnum, weights=~pw, data=apiclus1)
xclus1<-xdesign(id=list(~dnum), weights=~pw, data=apiclus1)
#> Warning: only one clustering dimension?
xclus1
#> 1-way crossed design:
#> xdesign(id = list(~dnum), weights = ~pw, data = apiclus1)
svymean(~enroll,dclus1r)
#>          mean     SE
#> enroll 549.72 45.646
svymean(~enroll,xclus1)
#>          mean     SE
#> enroll 549.72 46.964
data(salamander)
xsalamander<-xdesign(id=list(~Male, ~Female), data=salamander,
overlap="unbiased")
xsalamander
#> 2-way crossed design:
#> xdesign(id = list(~Male, ~Female), data = salamander, overlap = "unbiased")
degf(xsalamander)
#> [1] 32.72727