Maximum pseudolikelihood estimation in complex surveys

Fits a user-specified likelihood parametrised by multiple linear predictors to data from a complex sample survey and computes the sandwich variance estimator of the coefficients. Note that this function maximises an estimated population likelihood; it is not the sample MLE.

svymle(loglike, gradient = NULL, design, formulas, start = NULL,
       control = list(), na.action = "na.fail", method = NULL,
       lower = NULL, upper = NULL, influence = FALSE, ...)
# S3 method for svymle
summary(object, stderr = c("robust", "model"), ...)

Arguments

loglike: vectorised loglikelihood function
gradient: Derivative of loglike. Required for variance computation and helpful for fitting
design: a survey.design object
formulas: A list of formulas specifying the variable and linear predictors: see Details below
start: Starting values for parameters
control: control options for the optimiser: see the help page for the optimiser you are using.
lower,upper: Parameter bounds for bobyqa
influence: Return the influence functions (primarily for svyby)
na.action: Handling of NAs
method: "nlm" to use nlm, "uobyqa" or "bobyqa" to use those optimisers from the minqa package; otherwise passed to optim
...: Arguments to loglike and gradient that are not to be optimised over.
object: svymle object
stderr: Choice of standard error estimator. The default is a standard sandwich estimator. See Details below.

Details

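Optimisation is done by nlm by default or if method=="nlm". If method is "uobyqa" or "bobyqa", the corresponding optimiser from the minqa package is used. Otherwise optim is used, with method specifying the optim method and control specifying its control parameters.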

The design object contains all the data and design information from the survey, so all the formulas refer to variables in this object. The formulas argument needs to specify the response variable and a linear predictor for each freely varying argument of loglike.

Consider for example the dnorm function, with arguments x, mean, sd and log, and suppose we want to estimate the mean of y as a linear function of a variable z, and to estimate a constant standard deviation. The log argument must be fixed at TRUE to get the loglikelihood. A formulas argument would be list(~y, mean=~z, sd=~1). Note that the data variable y must be the first argument to dnorm and the first formula, and that all the other formulas are labelled. It is also permitted to have the data variable as the left-hand side of one of the formulas: e.g. list(mean=y~z, sd=~1).
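As an illustration of the dnorm case, the following base R sketch writes out a gradient function for the normal loglikelihood and checks it against central differences. The name grad_dnorm and the test values are hypothetical, and the layout of the result (one column per freely varying argument, in the order mean, sd) is an assumption about what svymle expects rather than something stated on this page.

```r
# Loglikelihood contributions: log = TRUE makes dnorm() return log densities.
loglike <- function(x, mean, sd) dnorm(x, mean = mean, sd = sd, log = TRUE)

# Hypothetical gradient: derivatives of the log density with respect to
# mean and sd, one row per observation and one column per parameter.
# The log argument is accepted only so the signature matches dnorm().
grad_dnorm <- function(x, mean, sd, log = TRUE) {
  d_mean <- (x - mean) / sd^2
  d_sd   <- (x - mean)^2 / sd^3 - 1 / sd
  cbind(d_mean, d_sd)
}

# Compare the analytic derivatives with central differences.
x  <- c(-1.2, 0.3, 2.5)
mu <- 0.5
s  <- 1.7
h  <- 1e-6
num_mean <- (loglike(x, mu + h, s) - loglike(x, mu - h, s)) / (2 * h)
num_sd   <- (loglike(x, mu, s + h) - loglike(x, mu, s - h)) / (2 * h)
max_err  <- max(abs(grad_dnorm(x, mu, s) - cbind(num_mean, num_sd)))

# The matching formulas argument, with the response on the left-hand side.
formulas <- list(mean = y ~ z, sd = ~1)
```

A small max_err suggests the analytic derivatives are consistent with the loglikelihood; a check of this kind is cheap to run before passing a gradient to the fitting function.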

The two optimisers from the minqa package do not require derivatives to be specified for optimisation, but they do assume that the function is smooth enough for a quadratic approximation, i.e., that two derivatives exist.

The usual variance estimator for MLEs in a survey sample is a 'sandwich' variance that requires the score vector and the information matrix. It requires only sampling assumptions to be valid (though some model assumptions are required for it to be useful). This is the stderr="robust" option, which is available only when the gradient argument was specified.

If the model is correctly specified and the sampling is at random conditional on variables in the model, then standard errors based on just the information matrix will be approximately valid. In particular, for independent sampling where weights and strata depend on variables in the model, the stderr="model" option should work fairly well.
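The difference between the two standard errors can be seen in a one-parameter toy calculation in base R. This is only a sketch under stated assumptions, not the computation svymle performs: the data and weights are simulated, the design is treated as a single stratum sampled with replacement, and the weights are rescaled to sum to the sample size so that the information-based variance is on the sample scale.

```r
set.seed(1)
n <- 200
y <- rpois(n, lambda = 3)   # simulated responses
w <- runif(n, 1, 5)         # hypothetical sampling weights
w <- w * n / sum(w)         # rescaled to sum to n (an assumption of this sketch)

# Maximiser of the weighted Poisson loglikelihood with a constant mean.
lambda_hat <- sum(w * y) / sum(w)

# Per-observation score and observed information at the estimate.
score <- y / lambda_hat - 1
info  <- y / lambda_hat^2

# "Bread": weighted information. "Meat": with-replacement variance of the
# weighted score total (the weighted scores sum to zero at the estimate).
A <- sum(w * info)
z <- w * score
B <- n / (n - 1) * sum(z^2)

var_model  <- 1 / A      # information matrix only
var_robust <- B / A^2    # sandwich: A^{-1} B A^{-1}
se <- c(model = sqrt(var_model), robust = sqrt(var_robust))
```

Here the weights are unrelated to y, so the model-based value ignores the extra variability introduced by unequal weighting, while the sandwich value accounts for it.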

Value

An object of class svymle

Author

Thomas Lumley

Examples
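
A minimal sketch of a call, assuming the survey package and its bundled api data (which provides the apistrat sample) are installed. The starting values are illustrative guesses, and because no gradient is supplied only the model-based standard errors are requested.

```r
# Guarded so that the sketch does nothing when survey is not installed.
if (requireNamespace("survey", quietly = TRUE)) {
  library(survey)
  data(api)  # example data shipped with survey, including apistrat

  # Stratified sample of schools with weights pw and population sizes fpc.
  dstrat <- svydesign(id = ~1, strata = ~stype, weights = ~pw,
                      data = apistrat, fpc = ~fpc)

  # Normal model: mean of api00 linear in api99 and ell, constant sd.
  # log = TRUE is passed through ... to dnorm.
  m1 <- svymle(loglike = dnorm, gradient = NULL, design = dstrat,
               formulas = list(mean = api00 ~ api99 + ell, sd = ~1),
               start = list(c(80, 1, 0), c(20)), log = TRUE)

  print(coef(m1))
  print(summary(m1, stderr = "model"))
}
```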