Computes the sum of products needed for the variance of survey sample estimators. svyCprod is used for survey design objects from before version 2.9; onestage is called by svyrecvar for post-2.9 design objects.

Usage

svyCprod(x, strata, psu, fpc, nPSU, certainty=NULL, postStrata=NULL,
      lonely.psu=getOption("survey.lonely.psu"))
onestage(x, strata, clusters, nPSU, fpc,
      lonely.psu=getOption("survey.lonely.psu"), stage=0, cal)

Arguments

x

A vector or matrix

strata

A vector of stratum indicators (may be NULL for svyCprod)

psu

A vector of cluster indicators (may be NULL)

clusters

A vector of cluster indicators

fpc

A data frame (svyCprod) or vector (onestage) of population stratum sizes, or NULL

nPSU

Table (svyCprod) or vector (onestage) of original sample stratum sizes (or NULL)

certainty

Logical vector with stratum names as names. If an element is TRUE and that stratum has a single PSU, that PSU is treated as a certainty PSU

postStrata

Post-stratification variables

lonely.psu

One of "remove", "adjust", "fail", "certainty", "average". See Details below

stage

Used internally to track the depth of recursion

cal

Used to pass calibration information at stages below the population

Details

The observations in each cluster are summed, the cluster totals are centered within each stratum, and the outer product of the resulting row vector for each cluster is taken. These outer products are summed within strata, multiplied by a degrees-of-freedom correction and by a finite population correction (if supplied), and then summed across strata.
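The steps in this paragraph can be sketched in base R. This is a minimal illustration, not the package's implementation: `sketch_cprod` and the toy data are hypothetical, x is assumed to hold already-weighted values, the degrees-of-freedom correction is assumed to be n/(n-1) for a stratum with n PSUs, and the finite population correction is assumed to be 1 - n/N. It ignores lonely PSUs, zero-row padding, post-stratification, and multistage designs.

```r
# Hypothetical toy data: two strata with two PSUs each (values assumed pre-weighted)
x      <- c(1, 2, 3, 5, 2, 4, 6, 9)
strata <- c("a", "a", "a", "a", "b", "b", "b", "b")
psu    <- c(1, 1, 2, 2, 3, 3, 4, 4)

# Illustrative helper, not part of the survey package
sketch_cprod <- function(x, strata, psu, fpc = NULL) {
  x <- as.matrix(x)
  # Step 1: add the observations within each cluster
  totals <- rowsum(x, psu)
  # Look up the stratum that each cluster belongs to
  psu_strata <- strata[match(rownames(totals), as.character(psu))]
  v <- matrix(0, ncol(x), ncol(x))
  for (s in unique(psu_strata)) {
    t_s <- totals[psu_strata == s, , drop = FALSE]
    n_s <- nrow(t_s)
    # Step 2: center the cluster totals within the stratum
    centered <- sweep(t_s, 2, colMeans(t_s))
    # Step 3: sum of outer products, scaled by n/(n-1) and an optional
    # finite population correction (assumed form: 1 - n/N)
    f <- if (is.null(fpc)) 1 else 1 - n_s / fpc[[s]]
    v <- v + f * n_s / (n_s - 1) * crossprod(centered)
  }
  # Step 4: the running sum over strata is the result
  v
}

v_nofpc <- sketch_cprod(x, strata, psu)                       # 1 x 1 matrix containing 106
v_fpc   <- sketch_cprod(x, strata, psu, fpc = c(a = 4, b = 10))
```

With two PSUs per stratum, each stratum's contribution reduces to the squared difference of its two cluster totals, which makes the result easy to check by hand.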

If there are fewer clusters (PSUs) in a stratum than in the original design, extra rows of zeroes are added to x so that the correct subpopulation variance can be computed.
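The effect of the zero padding can be seen in a small scalar sketch. The helper `pad_and_var` is hypothetical and handles only a single stratum with no finite population correction; it assumes the same n/(n-1) scaling as a simple stratified design.

```r
# Illustrative helper, not part of the survey package:
# pad the cluster totals with zeroes up to the original number of PSUs,
# then compute the stratum's variance contribution.
pad_and_var <- function(totals, nPSU) {
  padded <- c(totals, rep(0, nPSU - length(totals)))
  nPSU / (nPSU - 1) * sum((padded - mean(padded))^2)
}

# A subset keeps 2 of the 3 PSUs originally sampled in the stratum
unpadded <- pad_and_var(c(3, 8), nPSU = 2)  # treats the subset as the full design
padded   <- pad_and_var(c(3, 8), nPSU = 3)  # accounts for the PSU with no subset members
```

Here the padded calculation gives 49 and the unpadded one 25: the PSU with no observations in the subset contributes a total of zero, and leaving it out understates the variability of the subpopulation total.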

See postStratify for information about post-stratification adjustments.

The variance formula gives 0/0 if a stratum contains only one sampling unit. If the certainty argument specifies that this is a PSU sampled with probability 1 (a "certainty" PSU), then it does not contribute to the variance (this is correct only when there is no subsampling within the PSU -- otherwise it should be defined as a pseudo-stratum). If certainty is FALSE for this stratum or is not supplied, the result depends on lonely.psu.

The options are "fail" to give an error, "remove" or "certainty" to give a variance contribution of 0 for the stratum, "adjust" to center the stratum at the grand mean rather than the stratum mean, and "average" to assign strata with one PSU the average variance contribution from strata with more than one PSU. The choice is controlled by setting options(survey.lonely.psu). If this is not done, the factory default is "fail". Using "adjust" is conservative, and it would often be better to combine strata in some intelligent way. The properties of "average" have not been investigated thoroughly, but it may be useful when the lonely PSUs are due to a few strata having PSUs missing completely at random.

The "remove" and "certainty" options give the same result, but "certainty" is intended for situations where there is only one PSU in the population stratum, which is sampled with certainty (also called "self-representing" PSUs or strata). With "certainty" no warning is generated for strata with only one PSU. Ordinarily, svydesign will detect certainty PSUs, making this option unnecessary.
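As a brief illustration, the lonely-PSU behaviour described above is chosen through the global option. The snippet uses only base R's options() and getOption(); the choice of "adjust" is just an example, and the variable names are hypothetical.

```r
# Select the lonely-PSU handling, keeping the previous setting for later
old <- options(survey.lonely.psu = "adjust")

# Functions with the default lonely.psu = getOption("survey.lonely.psu")
# would now pick up this value
current <- getOption("survey.lonely.psu")

# ... variance calculations would go here ...

# Restore the previous setting
options(old)
```

Saving and restoring the old value keeps the change local to one analysis, which matters because the option affects every subsequent variance calculation in the session.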

For strata with a single PSU in a subset (domain) the variance formula gives a value that is well-defined and positive, but not typically correct. If options("survey.adjust.domain.lonely") is TRUE and options("survey.lonely.psu") is "adjust" or "average", and no post-stratification or G-calibration has been done, strata with a single PSU in a subset will be treated like those with a single PSU in the sample. I am not aware of any theoretical study of this procedure, but it should at least be conservative.
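The option-based part of this condition can be written as a small check. The helper `domain_adjust_active` is hypothetical and is not a function in the package; it only inspects the two options and does not test whether post-stratification or G-calibration has been done. It assumes that an unset survey.lonely.psu behaves as "fail".

```r
# Illustrative helper, not part of the survey package:
# TRUE when both options are set so that single-PSU strata in a domain
# would be treated like single-PSU strata in the sample.
domain_adjust_active <- function() {
  isTRUE(getOption("survey.adjust.domain.lonely")) &&
    getOption("survey.lonely.psu", "fail") %in% c("adjust", "average")
}

old <- options(survey.adjust.domain.lonely = TRUE,
               survey.lonely.psu = "adjust")
active_adjust <- domain_adjust_active()

# With "remove" the domain adjustment does not apply
options(survey.lonely.psu = "remove")
active_remove <- domain_adjust_active()

# Restore the previous settings
options(old)
```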

Value

A covariance matrix

Author

Thomas Lumley

References

Binder, David A. (1983). On the variances of asymptotically normal estimators from complex surveys. International Statistical Review, 51, 279-292.