Defines a design object with multiple dimensions of correlation. Observations that share any of the id variables are treated as correlated; alternatively, you can supply an adjacency matrix or Matrix to specify which pairs are correlated. Supports crossed designs (e.g. multiple raters of multiple objects) and non-nested observational correlation (e.g. observations sharing a primary school or a secondary school). Methods are available so far for svymean, svytotal, and svyglm.

xdesign(id = NULL, strata = NULL, weights = NULL, data, fpc = NULL,
adjacency = NULL, overlap = c("unbiased", "positive"), allow.non.binary = FALSE)

Arguments

id

list of formulas specifying cluster identifiers for each clustering dimension (or NULL)

strata

Not implemented

weights

model formula specifying (sampling) weights

data

data frame containing all the variables

fpc

Not implemented

adjacency

Adjacency matrix or Matrix indicating which pairs of observations are correlated

overlap

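Either "unbiased" or "positive"; controls how variance terms for observations that share more than one clustering dimension are counted. See Details.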

allow.non.binary

If FALSE, check that adjacency is a binary (0/1 or TRUE/FALSE) matrix or Matrix.
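As a rough illustration of what an adjacency matrix for non-nested correlation can look like, the sketch below builds one in base R for the primary/secondary school case mentioned above. The data frame `schools`, its column names, and the `is_binary` check are hypothetical and only mimic the kind of condition that allow.non.binary = FALSE refers to; they are not taken from the package.

```r
# Hypothetical toy data: six pupils, each with a primary and a secondary school
schools <- data.frame(
  primary   = c("P1", "P1", "P2", "P2", "P3", "P3"),
  secondary = c("S1", "S2", "S1", "S2", "S2", "S2")
)

# TRUE where a pair of pupils shares a school in the given dimension
same_primary   <- outer(schools$primary,   schools$primary,   "==")
same_secondary <- outer(schools$secondary, schools$secondary, "==")

# Pairs sharing a primary OR a secondary school are treated as correlated;
# multiplying by 1 converts the logical matrix to 0/1
adj <- (same_primary | same_secondary) * 1

# Illustrative stand-in for a binary check (an assumption, not package code)
is_binary <- all(adj %in% c(0, 1))
```

A matrix of this shape (square, symmetric, with ones on the diagonal) is what the adjacency argument describes; for large data sets a sparse Matrix would presumably be preferable to a dense one.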

Details

Subsetting for these objects actually drops observations; it is not equivalent to just setting weights to zero as for survey designs. So, for example, a subset of a balanced design will not be a balanced design.
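The effect of dropping observations on balance can be seen with a plain data frame. This is only a sketch of the underlying idea, using hypothetical variables `rater` and `object`; it does not call the subset method for xdesign objects.

```r
# Balanced crossed layout: every rater scores every object
balanced <- expand.grid(rater = 1:3, object = 1:4)
counts_before <- table(balanced$rater)

# Dropping rows (rather than zero-weighting them) removes rater 1's
# scores for objects 3 and 4, so the layout is no longer balanced
reduced <- subset(balanced, !(rater == 1 & object > 2))
counts_after <- table(reduced$rater)

counts_before
counts_after
```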

The overlap option controls double-counting of some variance terms. Suppose there are two clustering dimensions, ~a and ~b. If we compute variance matrices clustered on a and clustered on b and add them, pairs of observations that share both a and b will be counted twice, giving a positively biased estimator. We can subtract off a variance matrix clustered on combinations of a and b to give an unbiased variance estimator. However, the unbiased estimator is not guaranteed to be positive definite. In the references, Miglioretti and Heagerty use the overlap="positive" estimator and Cameron et al. use the overlap="unbiased" estimator.
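The arithmetic behind the two options can be sketched in base R for a scalar statistic. The example assumes a simplified sandwich-style quantity, e' A e, where A is a 0/1 matrix marking which pairs of observations share a cluster. The vectors `a`, `b`, and `e` are made-up values, and the code is an illustration of the add-and-subtract idea, not the package's internal computation.

```r
# Toy data: six observations, two crossed clustering dimensions
a <- c(1, 1, 2, 2, 3, 3)
b <- c(1, 2, 1, 2, 1, 1)
e <- c(0.5, -1.0, 0.3, 0.8, -0.4, -0.2)  # hypothetical centred scores

# 0/1 indicators: do observations i and j share a cluster?
A_a  <- outer(a, a, "==") * 1
A_b  <- outer(b, b, "==") * 1
A_ab <- A_a * A_b          # pairs sharing both a and b

# Quadratic forms: sums of e_i * e_j over the indicated pairs
v_a  <- drop(t(e) %*% A_a  %*% e)
v_b  <- drop(t(e) %*% A_b  %*% e)
v_ab <- drop(t(e) %*% A_ab %*% e)

v_positive <- v_a + v_b          # double-counts pairs sharing both a and b
v_unbiased <- v_a + v_b - v_ab   # subtracts the double-counted terms

# Subtracting the overlap is the same as counting each pair that
# shares a or b exactly once
A_any <- ((A_a + A_b) > 0) * 1
v_any <- drop(t(e) %*% A_any %*% e)
```

In this scalar sketch `v_ab` is a sum of squared within-cell totals, so `v_positive` is never smaller than `v_unbiased`. The matrix A_a + A_b is a sum of two positive semi-definite matrices, while the 0/1 matrix `A_any` carries no such guarantee, which is the trade-off described above.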

Value

An object of class xdesign

References

Miglioretti, D., & Heagerty, P. J. (2007). Marginal modeling of nonnested multilevel data using standard software. American Journal of Epidemiology, 165(4), 453-463.

Cameron, A. C., Gelbach, J. B., & Miller, D. L. (2011). Robust Inference With Multiway Clustering. Journal of Business & Economic Statistics, 29(2), 238-249.

https://notstatschat.rbind.io/2021/09/18/crossed-clustering-and-parallel-invention/

Examples



## With one clustering dimension, the estimator is close to the
##   with-replacement survey estimator, but not identical unless
##   clusters are of equal size
library(survey)
data(api)
dclus1r<-svydesign(id=~dnum, weights=~pw, data=apiclus1)
xclus1<-xdesign(id=list(~dnum), weights=~pw, data=apiclus1)
#> Warning: only one clustering dimension?
xclus1
#> 1-way crossed design:
#> xdesign(id = list(~dnum), weights = ~pw, data = apiclus1)

svymean(~enroll,dclus1r)
#>          mean     SE
#> enroll 549.72 45.646
svymean(~enroll,xclus1)
#>          mean     SE
#> enroll 549.72 46.964

data(salamander)
xsalamander<-xdesign(id=list(~Male, ~Female), data=salamander,
    overlap="unbiased")
xsalamander
#> 2-way crossed design:
#> xdesign(id = list(~Male, ~Female), data = salamander, overlap = "unbiased")
degf(xsalamander)
#> [1] 32.72727