svyCprod {survey}                R Documentation

Computations for survey variances


Description

Computes the sum of products needed for the variance of survey sample estimators. svyCprod is used for survey design objects from before version 2.9; onestage is called by svyrecvar for post-2.9 design objects.


Usage

svyCprod(x, strata, psu, fpc, nPSU,certainty=NULL, postStrata=NULL,
onestage(x, strata, clusters, nPSU, fpc,


Arguments

x            A vector or matrix.
strata       A vector of stratum indicators (may be NULL for svyCprod).
psu          A vector of cluster indicators (may be NULL).
clusters     A vector of cluster indicators.
fpc          A data frame (svyCprod) or vector (onestage) of population stratum sizes, or NULL.
nPSU         Table (svyCprod) or vector (onestage) of original sample stratum sizes (or NULL).
certainty    Logical vector with stratum names as names. If TRUE and that stratum has a single PSU, it is a certainty PSU.
postStrata   Post-stratification variables.
lonely.psu   One of "remove", "adjust", "fail", "certainty", "average". See Details below.
stage        Used internally to track the depth of recursion.
cal          Used to pass calibration information at stages below the population.


Details

The observations for each cluster are added, then centered within each stratum, and the outer product is taken of the row vector resulting for each cluster. These outer products are added within strata, multiplied by a degrees-of-freedom correction and by a finite population correction (if supplied), and added across strata.
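The steps above can be sketched in base R. This is an illustration under stated assumptions, not the package's own code: the helper name one_stage_sketch is hypothetical, x is assumed to be already weighted, the degrees-of-freedom correction is taken to be n_h/(n_h - 1), and fpc is assumed to be a named vector of population stratum sizes N_h giving the factor (N_h - n_h)/N_h.

```r
# Hypothetical helper (not part of the survey package): a base-R sketch of the
# one-stage sum-of-products computation described in the text.
one_stage_sketch <- function(x, strata, clusters, fpc = NULL) {
  x <- as.matrix(x)
  strata <- as.character(strata)
  # Key clusters by stratum and cluster label so labels may repeat across strata
  key <- paste(strata, clusters, sep = ":")
  # Add the observations within each cluster (one row per cluster)
  totals <- rowsum(x, key, reorder = FALSE)
  cluster_stratum <- strata[match(rownames(totals), key)]
  v <- matrix(0, ncol(x), ncol(x))
  for (h in unique(cluster_stratum)) {
    th <- totals[cluster_stratum == h, , drop = FALSE]
    n_h <- nrow(th)
    # Centre the cluster totals at the stratum mean
    centred <- sweep(th, 2, colMeans(th))
    # Sum of outer products of the centred rows
    ss <- crossprod(centred)
    # Assumed finite population correction: (N_h - n_h) / N_h, or 1 if no fpc
    f <- if (is.null(fpc)) 1 else (fpc[[h]] - n_h) / fpc[[h]]
    # Assumed degrees-of-freedom correction: n_h / (n_h - 1)
    v <- v + f * n_h / (n_h - 1) * ss
  }
  v
}

# Toy data: two strata with two clusters each
x <- c(1, 2, 3, 4, 5, 6, 7, 8)
strata <- c("a", "a", "a", "a", "b", "b", "b", "b")
clusters <- c(1, 1, 2, 2, 1, 1, 2, 2)
v <- one_stage_sketch(x, strata, clusters)
v_fpc <- one_stage_sketch(x, strata, clusters, fpc = c(a = 4, b = 4))
```

With a matrix x the same function returns a full covariance matrix, because crossprod sums the outer products of the centred rows.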

If there are fewer clusters (PSUs) in a stratum than in the original design, extra rows of zeroes are added to x to allow the correct subpopulation variance to be computed.
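The zero-padding can be illustrated with a small base-R sketch for a single stratum. The helper pad_with_zero_rows and the toy numbers are hypothetical, and the n_h/(n_h - 1) scaling is an assumption about the degrees-of-freedom correction.

```r
# Hypothetical helper: pad one stratum's cluster totals with rows of zeros so
# that the number of rows matches the number of PSUs in the original design.
pad_with_zero_rows <- function(totals, n_original) {
  totals <- as.matrix(totals)
  n_missing <- n_original - nrow(totals)
  if (n_missing > 0) {
    totals <- rbind(totals, matrix(0, n_missing, ncol(totals)))
  }
  totals
}

# The stratum was sampled with 3 PSUs, but only 2 have observations in the domain
domain_totals <- c(3, 7)
padded <- pad_with_zero_rows(domain_totals, n_original = 3)

# Centre and scale using the original number of PSUs, not the 2 observed
n_h <- nrow(padded)
centred <- sweep(padded, 2, colMeans(padded))
v_h <- n_h / (n_h - 1) * crossprod(centred)
```

A PSU with no observations in the domain contributes a total of zero, which is why the padded rows carry information rather than being ignored.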

See postStratify for information about post-stratification adjustments.

The variance formula gives 0/0 if a stratum contains only one sampling unit. If the certainty argument specifies that this is a PSU sampled with probability 1 (a "certainty" PSU), then it does not contribute to the variance. This is correct only when there is no subsampling within the PSU; otherwise it should be defined as a pseudo-stratum. If certainty is FALSE for this stratum or is not supplied, the result depends on lonely.psu.

The options are "fail" to give an error, "remove" or "certainty" to give a variance contribution of 0 for the stratum, "adjust" to center the stratum at the grand mean rather than the stratum mean, and "average" to assign strata with one PSU the average variance contribution from strata with more than one PSU. The choice is controlled by setting the survey.lonely.psu option with options(). If this is not done, the factory default is "fail". Using "adjust" is conservative, and it would often be better to combine strata in some intelligent way. The properties of "average" have not been investigated thoroughly, but it may be useful when the lonely PSUs are due to a few strata having PSUs missing completely at random.

The "remove" and "certainty" options give the same result, but "certainty" is intended for situations where there is only one PSU in the population stratum, which is sampled with certainty (also called "self-representing" PSUs or strata). With "certainty" no warning is generated for strata with only one PSU. Ordinarily, svydesign will detect certainty PSUs, making this option unnecessary.
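A short base-R illustration of the single-PSU problem and of selecting a lonely.psu behaviour follows. The n_h/(n_h - 1) scaling is an assumed form of the degrees-of-freedom correction; options() and getOption() are base R, and only the option name is taken from the text above.

```r
# A stratum with one PSU: the centred total is 0 and the divisor n_h - 1 is 0,
# so the scaled contribution is 0/0
n_h <- 1
ss <- 0                                  # sum of squares about the stratum mean
contribution <- n_h / (n_h - 1) * ss     # Inf * 0 in floating point

# Choose how lonely PSUs are handled, keeping the previous setting
old <- options(survey.lonely.psu = "adjust")
chosen <- getOption("survey.lonely.psu")

# Restore the previous setting when done
options(old)
```

Saving and restoring the old value keeps the choice local to one analysis, since the option is global to the R session.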

For strata with a single PSU in a subset (domain) the variance formula gives a value that is well-defined and positive, but not typically correct. If options("survey.adjust.domain.lonely") is TRUE and options("survey.lonely.psu") is "adjust" or "average", and no post-stratification or G-calibration has been done, strata with a single PSU in a subset will be treated like those with a single PSU in the sample. I am not aware of any theoretical study of this procedure, but it should at least be conservative.
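The conditions in the preceding paragraph can be written as a small check. The function domain_lonely_adjusted and its adjusted argument are hypothetical and only restate the rule in the text; they are not part of the survey package. The fallback value "fail" is the factory default named above.

```r
# Hypothetical helper: TRUE when a stratum with a single PSU in a subset would
# be treated like a stratum with a single PSU in the sample, per the rule above.
# 'adjusted' flags that post-stratification or G-calibration has been done.
domain_lonely_adjusted <- function(adjusted = FALSE) {
  isTRUE(getOption("survey.adjust.domain.lonely")) &&
    getOption("survey.lonely.psu", "fail") %in% c("adjust", "average") &&
    !adjusted
}

old <- options(survey.adjust.domain.lonely = TRUE,
               survey.lonely.psu = "average")
with_average <- domain_lonely_adjusted()
after_calibration <- domain_lonely_adjusted(adjusted = TRUE)

options(survey.lonely.psu = "remove")
with_remove <- domain_lonely_adjusted()

options(old)  # restore the previous settings
```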


Value

A covariance matrix.


Author(s)

Thomas Lumley


References

Binder, David A. (1983). On the variances of asymptotically normal estimators from complex surveys. International Statistical Review, 51, 279-292.

See Also

svydesign, svyrecvar, surveyoptions, postStratify

[Package survey version 3.18 Index]