`svrepdesign.Rd`

Some recent large-scale surveys specify replication weights rather than the sampling design (partly for privacy reasons). This function specifies the data structure for such a survey.

```
svrepdesign(variables , repweights , weights, data,...)
# S3 method for default
svrepdesign(variables = NULL, repweights = NULL, weights = NULL,
data = NULL, type = c("BRR", "Fay", "JK1","JKn","bootstrap",
"ACS","successive-difference","JK2","other"),
combined.weights=TRUE, rho = NULL, bootstrap.average=NULL,
scale=NULL, rscales=NULL,fpc=NULL, fpctype=c("fraction","correction"),
mse=getOption("survey.replicates.mse"),...)
# S3 method for imputationList
svrepdesign(variables=NULL, repweights,weights,data,
mse=getOption("survey.replicates.mse"),...)
# S3 method for character
svrepdesign(variables=NULL,repweights=NULL, weights=NULL,data=NULL,
type=c("BRR","Fay","JK1", "JKn","bootstrap","ACS","successive-difference","JK2","other"),
combined.weights=TRUE, rho=NULL, bootstrap.average=NULL, scale=NULL,rscales=NULL,
fpc=NULL,fpctype=c("fraction","correction"),mse=getOption("survey.replicates.mse"),
dbtype="SQLite", dbname,...)
# S3 method for svyrep.design
image(x, ...,
col=grey(seq(.5,1,length=30)), type.=c("rep","total"))
```

- variables
formula or data frame specifying variables to include in the design (default is all)

- repweights
formula or data frame specifying replication weights, or character string specifying a regular expression that matches the names of the replication weight variables

- weights
sampling weights

- data
data frame to look up variables in formulas, or character string giving name of database table

- type
Type of replication weights

- combined.weights
`TRUE` if the `repweights` already include the sampling weights. This is usually the case.

- rho
Shrinkage factor for weights in Fay's method

- bootstrap.average
For `type="bootstrap"`, if the bootstrap weights have been averaged, gives the number of iterations averaged over

- scale, rscales
Scaling constant for variance, see Details below

- fpc,fpctype
Finite population correction information

- mse
If `TRUE`, compute variances based on sum of squares around the point estimate, rather than the mean of the replicates

- dbname
name of database, passed to `DBI::dbConnect()`

- dbtype
Database driver: see Details

- x
survey design with replicate weights

- ...
Other arguments to `image`

- col
Colors

- type.
`"rep"` for only the replicate weights, `"total"` for the replicate and sampling weights combined.

In the BRR method, the dataset is split into halves, and the
difference between halves is used to estimate the variance. In Fay's
method, rather than removing observations from half the sample, they
are given weight `rho` in one half-sample and `2-rho` in the
other. The ideal BRR analysis is restricted to a design where each
stratum has two PSUs; however, it has been used in a much wider class
of surveys. The `scale` and `rscales` arguments will be ignored
(with a warning) if they are specified.
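
As a rough illustration of the Fay weighting described above, the sketch below reuses the half-sample pattern from the Levy and Lemeshow example in the Examples section and turns it into Fay weights. The value `rho = 0.3` and the object names `half`, `fayweights` and `scdfay` are illustrative choices, not part of the package.

```r
library(survey)
data(scd)

# Half-sample indicators (1 = in the half-sample, 0 = not), one column per replicate
half <- cbind(c(1,0,1,0,1,0), c(1,0,0,1,0,1), c(0,1,1,0,0,1),
              c(0,1,0,1,1,0))

# Fay's method: weight 2 - rho in one half-sample and rho in the other,
# instead of the 2 and 0 used by plain BRR
rho <- 0.3   # illustrative value only
fayweights <- rho + (2 - 2 * rho) * half

# rho must be passed so the variance can be scaled to match the weights
scdfay <- svrepdesign(data = scd, type = "Fay", rho = rho,
                      repweights = fayweights, combined.weights = FALSE)
svymean(~alive, scdfay)
```

As in the BRR example, no sampling weights are supplied, so a warning that equal probability is assumed should be expected.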

The JK1 and JKn types are both jackknife estimators deleting one cluster at a time. JKn is designed for stratified and JK1 for unstratified designs.

The successive-difference weights in the American Community Survey
automatically use `scale = 4/ncol(repweights)` and
`rscales=rep(1, ncol(repweights))`. This can be specified as
`type="ACS"` or `type="successive-difference"`. The `scale` and
`rscales` arguments will be ignored (with a warning) if they are specified.

JK2 weights (`type="JK2"`), as in the California Health Interview
Survey, automatically use `scale=1`, `rscales=rep(1, ncol(repweights))`.
The `scale` and `rscales` arguments will be ignored (with a warning)
if they are specified.

Averaged bootstrap weights ("mean bootstrap") are used for some surveys from Statistics Canada. Yee et al (1999) describe their construction and use for one such survey.

The variance is computed as the sum of squared deviations of the
replicates from their mean. This may be rescaled: `scale` is an
overall multiplier and `rscales` is a vector of
replicate-specific multipliers for the squared deviations. That is,
`rscales` should have one entry for each column of `repweights`.
If the replication weights incorporate the sampling weights
(`combined.weights=TRUE`) or for `type="other"`, these must
be specified; otherwise they can be guessed from the weights.
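
The rescaled sum of squares described above can be checked by hand on a small example. This is a minimal sketch, assuming a toy simple random sample with delete-one jackknife weights; the data and the names `toy`, `repw`, `jk_scale` and `jk_rscales` are made up for illustration, and `mse=FALSE` is set explicitly so that deviations are taken around the mean of the replicates.

```r
library(survey)

# Toy data, treated as a simple random sample with equal sampling weights
toy <- data.frame(y = c(3, 7, 8, 12, 15), w = 1)
n <- nrow(toy)

# Delete-one jackknife replicate weights, not multiplied by the sampling weights:
# replicate r gives weight 0 to observation r and n/(n-1) to the rest
repw <- matrix(n / (n - 1), n, n)
diag(repw) <- 0

jk_scale   <- (n - 1) / n   # overall multiplier
jk_rscales <- rep(1, n)     # one multiplier per replicate

des <- svrepdesign(data = toy, weights = ~w, repweights = repw, type = "JK1",
                   combined.weights = FALSE, scale = jk_scale,
                   rscales = jk_rscales, mse = FALSE)

# Replicate estimates of the mean, then the rescaled sum of squared deviations
theta  <- apply(repw, 2, function(r) weighted.mean(toy$y, toy$w * r))
v_hand <- jk_scale * sum(jk_rscales * (theta - mean(theta))^2)

c(by_hand = v_hand, survey = as.numeric(SE(svymean(~y, des))^2))
```

For this design the hand computation also reduces to the textbook `var(y)/n`, which gives a second check on the scaling constants.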

A finite population correction may be specified for `type="other"`,
`type="JK1"` and `type="JKn"`. `fpc` must be a vector
with one entry for each replicate. To specify sampling fractions use
`fpctype="fraction"` and to specify the correction directly use
`fpctype="correction"`.
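
The two ways of giving the finite population correction can be compared on a toy jackknife design. This is a sketch under stated assumptions: the data, the population size `N = 20`, and all object names are hypothetical, and the "correction" is taken to be one minus the sampling fraction, which is an interpretation of the text above rather than something this page states outright.

```r
library(survey)

# Hypothetical sample of n = 5 from a population of N = 20
toy <- data.frame(y = c(3, 7, 8, 12, 15), w = 4)
n <- nrow(toy)
N <- 20

# Delete-one jackknife replicate weights (not combined with sampling weights)
repw <- matrix(n / (n - 1), n, n)
diag(repw) <- 0

# Helper to build the same design with different fpc settings
make_design <- function(...) {
  svrepdesign(data = toy, weights = ~w, repweights = repw, type = "JK1",
              combined.weights = FALSE, scale = (n - 1) / n,
              rscales = rep(1, n), ...)
}

des_nofpc <- make_design()
# One entry per replicate: sampling fraction n/N
des_frac  <- make_design(fpc = rep(n / N, n), fpctype = "fraction")
# One entry per replicate: correction given directly (assumed to be 1 - n/N)
des_corr  <- make_design(fpc = rep(1 - n / N, n), fpctype = "correction")

se_nofpc <- as.numeric(SE(svymean(~y, des_nofpc)))
se_frac  <- as.numeric(SE(svymean(~y, des_frac)))
se_corr  <- as.numeric(SE(svymean(~y, des_corr)))
c(no_fpc = se_nofpc, fraction = se_frac, correction = se_corr)
```

Under that assumption the last two standard errors should agree and both should be smaller than the uncorrected one.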

`repweights` may be a character string giving a regular expression
for the replicate weight variables. For example, in the
California Health Interview Survey public-use data, the sampling weights are
`"rakedw0"` and the replicate weights are `"rakedw1"` to
`"rakedw80"`. The regular expression `"rakedw[1-9]"`
matches the replicate weight variables (and not the sampling weight
variable).

`data` may be a character string giving the name of a table or view
in a relational database that can be accessed through the `DBI`
interface. For DBI interfaces `dbtype` should be the name of the database
driver and `dbname` should be the name by which the driver identifies
the specific database (e.g. file name for SQLite).

The appropriate database interface package must already be loaded (e.g.
`RSQLite` for SQLite). The survey design
object will contain the replicate weights, but actual variables will
be loaded from the database only as needed. Use
`close` to close the database connection and
`open` to reopen the connection, e.g. after
loading a saved object.

The database interface does not attempt to modify the underlying database and so can be used with read-only permissions on the database.

To generate your own replicate weights either use
`as.svrepdesign` on a `survey.design` object, or see
`brrweights`, `bootweights`, `jk1weights` and `jknweights`.

The `model.frame` method extracts the observed data.
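
As a short illustration of the `as.svrepdesign` route mentioned above, the sketch below converts a cluster-sample design built from the package's `api` data into a replicate-weight design. The object names `dclus1` and `rclus1` are arbitrary, and the replicate type is left at the `as.svrepdesign` default rather than chosen explicitly.

```r
library(survey)
data(api)

# One-stage cluster sample of school districts from the api data
dclus1 <- svydesign(id = ~dnum, weights = ~pw, data = apiclus1, fpc = ~fpc)

# Convert to a replicate-weight design (replicate type left at its default)
rclus1 <- as.svrepdesign(dclus1)

# Analyses then use the replicate weights for variance estimation
svymean(~api00, rclus1)

# model.frame() returns the observed data stored in the design
head(model.frame(rclus1)[, c("api00", "api99")])
```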

Object of class `svyrep.design`, with methods for `print`,
`summary`, `weights`, `image`.

Levy and Lemeshow. "Sampling of Populations". Wiley.

Shao and Tu. "The Jackknife and Bootstrap." Springer.

Yee et al (1999). Bootstrap Variance Estimation for the National Population Health Survey. Proceedings of the ASA Survey Research Methodology Section. https://web.archive.org/web/20151110170959/http://www.amstat.org/sections/SRMS/Proceedings/papers/1999_136.pdf

To use replication-weight analyses on a survey specified by
sampling design, use `as.svrepdesign` to convert it.

`as.svrepdesign`, `svydesign`, `brrweights`, `bootweights`

```
data(scd)
# use BRR replicate weights from Levy and Lemeshow
repweights<-2*cbind(c(1,0,1,0,1,0), c(1,0,0,1,0,1), c(0,1,1,0,0,1),
c(0,1,0,1,1,0))
scdrep<-svrepdesign(data=scd, type="BRR", repweights=repweights, combined.weights=FALSE)
#> Warning: No sampling weights provided: equal probability assumed
svyratio(~alive, ~arrests, scdrep)
#> Ratio estimator: svyratio.svyrep.design(~alive, ~arrests, scdrep)
#> Ratios=
#> arrests
#> alive 0.1535064
#> SEs=
#> [,1]
#> [1,] 0.009418401
if (FALSE) {
## Needs RSQLite
library(RSQLite)
db_rclus1<-svrepdesign(weights=~pw, repweights="wt[1-9]+", type="JK1", scale=(1-15/757)*14/15,
data="apiclus1rep",dbtype="SQLite", dbname=system.file("api.db",package="survey"), combined.weights=FALSE)
svymean(~api00+api99,db_rclus1)
summary(db_rclus1)
## closing and re-opening a connection
close(db_rclus1)
db_rclus1
try(svymean(~api00+api99,db_rclus1))
db_rclus1<-open(db_rclus1)
svymean(~api00+api99,db_rclus1)
}
```