nls2
To Estimate the parameters of the Regression Model
DESCRIPTION:
Estimation of the parameters of a nonlinear regression model
over a given set of observations.
The regression function can be defined explicitly
as a function of independent variables and of unknown parameters,
or it can be defined as the solution of a system of differential equations.
Heteroscedasticity of errors can be taken into account by modelling the
variance function.
The model should be described in an operating-system file.
USAGE:
nls2(data, model, stat.ctx,
integ.ctx=NULL,
method=NULL, control=NULL,
renls2=FALSE)
REQUIRED ARGUMENTS:
- data
-
the data frame that contains the independent variables
and the response.
Missing values are not allowed.
"data" can also include a vector called "weights" that contains
positive or zero weighting values
(a zero weight implies the suppression of the corresponding observation)
and a character vector called "curves" that contains
curve identifiers.
The response and the independent variables are identified by examining
the model description-file.
Any other vectors included in data are ignored.
The observations must be sorted first on curves and then
on replications, if any.
- model
-
a list that specifies the characteristics of the model to be studied
with at least the component
"file", a character string equal to the pathname
of the operating-system file that contains the description of the model
(see `nls2.model').
"model" can also include the following components:
vari.type, gamf, gamv, eqp.theta, eq.theta, inf.theta, sup.theta
and eqp.beta, eq.beta, inf.beta, sup.beta.
See the paragraph `MODEL DESCRIPTION'.
- stat.ctx
-
a list that describes the statistical context
with at least the components:
- theta.start:
-
a vector containing
the starting values of the regression parameters.
- beta.start:
-
a vector containing
the starting values of the parameters that appear
in the variance function only, when there are such parameters.
The length of these vectors must be equal to the
number of multiple parameters, and their values must be sorted according to
the declarations of the model description-file.
If there are any curves, the vectors must be sorted
on curves first.
"stat.ctx" can also include the following components:
algorithm,
sigma2.type, sigma2,
mu.type, mu3, mu4,
max.err.c1, max.err.c2, max.iters,
omega.c1, omega.c2,
lambda.start, lambda.c1, lambda.c2, max.lambda,
max.stop.crit,
nameN, family.
See the paragraph `STAT.CTX DESCRIPTION'.
- integ.ctx
-
a list of characteristics required when
the model is described
by a differential equations system
with at least the components:
- "start": the initial value of the
integration variable (i.e., the variable
introduced by the `varint' declaration in the model description-file);
this variable will be referred to as "time" in this section.
"start" is:
. a scalar if the initial value of time depends neither on the
independent variables nor on the curves;
. a vector if the initial value of time is not the same for
all curves and if the independent variables do not appear in the equations.
The length of "start" is then equal to the number of curves;
. a vector if the independent variables appear in the equations.
The length of "start" is then equal to the number of observations,
including replications and zero-weighted observations, if any.
- "nb.theta.odes": the number of parameters of the ODEs to be estimated.
"nb.theta.odes" is required only when the model is evaluated by programs
(see `loadnls2').
"integ.ctx" can also include the following components:
cond.start, integ.values,
print, itol, atol, rtol, iopt, ropt.
See the help-file `nls2.integ.ctx'.
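As an illustration of the requirements on "data" stated above, the following sketch prepares a small data frame in base R syntax. It is not part of the original help file: the values are invented, and sorting on "x" within each curve (so that replications stay adjacent) is an assumption made for the example only.

```r
# Hypothetical observations for two curves, "A" and "B", with replications
raw <- data.frame(
  x      = c(2, 1, 1, 2, 1, 2),
  y      = c(4.1, 2.0, 2.2, 3.9, 1.9, 4.3),
  curves = c("B", "A", "B", "A", "A", "B"),
  stringsAsFactors = FALSE          # keep "curves" as a character vector
)

# Weights must be positive or zero; a zero weight suppresses the observation
raw$weights <- c(1, 1, 0, 1, 1, 1)

# Missing values are not allowed in "data"
stopifnot(!anyNA(raw), all(raw$weights >= 0))

# Sort on curves first, then on x so that replications are grouped
# (assumption: replications share the same value of x)
data <- raw[order(raw$curves, raw$x), ]
rownames(data) <- NULL
```

Columns other than the response, the independent variables, "weights" and "curves" would be ignored by "nls2", so nothing else needs to be dropped from the data frame.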
OPTIONAL ARGUMENTS:
- method
-
the requested method of estimation.
See the paragraph `METHOD DESCRIPTION'.
- control
-
a list of values for controlling which
intermediate results are printed or saved and when.
It also makes it possible
to stop the execution when too many warnings appear.
See the help file for `nls2.control'.
- renls2
-
a logical value that should be TRUE if the current call to "nls2"
is the first one of a series (see functions `bootstrap.nls2',
`renls2', `rexnls2').
VALUE:
An object of class "nls2" is returned: see `nls2.object'.
The function `loadnls2' should have been previously called
to load into the S-session all the programs necessary
for execution.
SIDE EFFECTS:
If the model is not evaluated by a program
(see `loadnls2'),
an operating-system file is created which contains the C-source programs
that correspond to the formal description of the model.
If this file already exists, it is replaced.
MODEL DESCRIPTION:
In addition to the component "file", the argument `model' can include:
- vari.type:
-
the variance type. Valid values are:
"CST": the variance is constant.
"SW": the variance is weighted.
"VST": the variance depends only on the parameters that appear
in the regression function
and is known up to square sigma.
"VB": the variance depends only on the parameters
of the variance function that do not appear in the regression function.
"VSB": the variance depends only on the parameters of the variance function
that do not appear in the regression function
and is known up to square sigma.
"VSTB": the variance depends on the parameters of the regression function
and on the parameters of the variance
function and is known up to square sigma.
"VTB": the variance depends on the parameters of the regression function
and on the parameters of the variance function.
"VI": case of an experimental design with replications in which
the variances are estimated by the empirical variances.
The default is determined
according to the values of the other inputs.
- gamf:
-
the values of the second level parameters
of the regression (i.e., parameters that will not be estimated)
sorted according to the `pbisresp' declaration
of the model description-file.
- gamv:
-
the values of the second level parameters
of the variance function sorted according to the
`pbisvar' declaration of the model description-file.
- eqp.theta:
-
equality constraints between the regression parameters.
An ordered vector whose values are the
successive integers from 1
to the total number of different parameters:
elements that correspond to equal parameters
should be set to the same value.
- eq.theta:
-
equality numerical constraints on
the regression parameters.
- inf.theta:
-
lower bound numerical constraints on
the regression parameters.
- sup.theta:
-
upper bound numerical constraints on
the regression parameters.
Elements inside the vectors of constraints
are sorted according to the `parvar' declaration
of the model description-file.
If there are any curves, they must be sorted
on curves first.
Elements in the vectors of numerical constraints
that correspond
to non-constrained parameters should be set to NaN.
The corresponding components for
the constraints on the parameters that appear in the variance function only,
are: "eqp.beta,eq.beta,inf.beta,sup.beta".
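The constraint vectors described above can be built as in the following sketch, written in base R syntax. The file "weibull" is the one of Example 1; the numerical bounds and the two-curve layout used for `eqp.theta' are hypothetical, and the coding of `eqp.theta' shown here is only one reading of the description above.

```r
# Four regression parameters p1, p2, p3, p4, as in the "weibull" file
n.theta <- 4

# NaN marks a parameter that carries no numerical constraint
eq.theta  <- rep(NaN, n.theta)
inf.theta <- rep(NaN, n.theta)
sup.theta <- rep(NaN, n.theta)

eq.theta[1]  <- 70   # hypothetical: p1 fixed at 70
inf.theta[3] <- 0    # hypothetical: p3 >= 0
sup.theta[4] <- 5    # hypothetical: p4 <= 5

model <- list(file = "weibull",
              eq.theta = eq.theta,
              inf.theta = inf.theta,
              sup.theta = sup.theta)

# Hypothetical two-curve case (parameters sorted on curves first):
# p1 is common to both curves, all other parameters differ.
# Equal parameters share the same integer; the integers run from 1
# to the number of different parameters (7 here).
eqp.theta <- c(1, 2, 3, 4,   # curve 1: p1, p2, p3, p4
               1, 5, 6, 7)   # curve 2: p1 (same as curve 1), p2, p3, p4
```

Starting every vector at NaN and then setting only the constrained entries keeps the vectors at the right length and in the declaration order.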
STAT.CTX DESCRIPTION:
In addition to the component `theta.start' and the component `beta.start',
the argument `stat.ctx' can contain the following components:
- max.iters:
-
the maximum number of iterations.
If zero, no estimation is performed and the outputs are calculated using
the starting values of the parameters.
The default setting is ten times the number of multiple parameters.
- max.stop.crit:
-
the upper bound of the stopping criterion.
Default value: 10e-8
- algorithm:
-
a character string that specifies the algorithm requested:
"GM" if Gauss-Marquardt, "GN" if Gauss-Newton.
Default value: "GM"
- sigma2.type:
-
a character string that specifies the way sigma2 should be estimated.
The valid values are
"KNOWN" (the value of square sigma is known and provided in the "sigma2" component),
"VARREP" (square sigma is estimated by the intra-replications variance),
"VARRESID" (square sigma is estimated by the residual variance),
"IGNORED" (sigma does not play any role in the variance estimation),
"VARINTRA" (case of an experimental design with replications in which
sigma does not appear).
The default is determined
according to the variance type.
- sigma2:
-
the square value of sigma when it is known.
- mu.type:
-
a character string that specifies the type of the 3rd and 4th order moments.
The valid values are
"KNOWN" (the values of the moments are known and provided in the
"mu3" and "mu4" components),
"MUGAUSS" (the moments are those of a Gaussian distribution),
"MURES" (the error variance is assumed to be constant and the moments
are estimated from the residuals),
"MURESREP" (the moments
are estimated from the residuals on the replications).
Default setting: "MUGAUSS".
- mu3,mu4:
-
the values of the 3rd and 4th order moments when they are known.
Vectors of length equal to the number of observations, not including
replications if any.
"mu3" and "mu4" should be sorted as the observations in the data frame.
- lambda.start:
-
the initial value of the Gauss-Marquardt parameter.
Default value: 5
- lambda.c1:
-
the value by which the Gauss-Marquardt parameter
is multiplied when calculating the optimal step
if the criterion is minimum at the center of the current interval.
Default value: 0.1
- lambda.c2:
-
the value by which the Gauss-Marquardt parameter
is multiplied when calculating the optimal step
if the criterion is minimum at the beginning of the current interval.
Default value: 100/lambda.c1
- max.lambda:
-
the upper bound of the Gauss-Marquardt parameter.
Default value: 10e-6
- max.err.c1:
-
the maximum number of times
the direction should be modified when the model cannot be calculated.
Default value: 5
- max.err.c2:
-
the maximum number of times a correction should be
made when the optimal step is calculated.
Default value: 5
- omega.c1:
-
the correction of
the direction when the model cannot be calculated.
Default value: 0.5
- omega.c2:
-
the correction of
the direction when calculating the optimal step
if the criterion is minimum at the beginning point of the interval
and if the Gauss-Newton algorithm is used.
Default value: 0.01
- family:
-
a character string among
"gaussian", "poisson", "binomial", "bernoulli", "multinomial",
to specify the family of the model.
Default value: "gaussian".
Note: here this is a character string, not a
family object as for `glm'.
- nameN:
-
when the family of the model is "binomial"
or "multinomial",
a character string to specify the name of the independent
variable which contains the counts.
Should be one of the variables declared by `varind'
in the model description-file.
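The following sketch, in base R syntax, shows how a `stat.ctx' list using some of the components above might be assembled and checked before calling "nls2". The starting values and the variable name "n" are hypothetical; the checks are a convenience for the example and are not performed this way by "nls2" itself as far as this help file states.

```r
# Family names accepted according to the description above
valid.families <- c("gaussian", "poisson", "binomial", "bernoulli", "multinomial")

ctx <- list(
  theta.start = c(0, 1),    # hypothetical starting values
  family      = "binomial", # a character string, not a family object
  nameN       = "n",        # hypothetical independent variable holding the counts
  algorithm   = "GM",       # Gauss-Marquardt
  max.iters   = 50
)

# "family" must be one of the listed character strings
stopifnot(is.character(ctx$family), ctx$family %in% valid.families)

# "nameN" is needed only for the binomial and multinomial families
needs.nameN <- ctx$family %in% c("binomial", "multinomial")
stopifnot(!needs.nameN || is.character(ctx$nameN))
```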
METHOD DESCRIPTION:
The valid values for the argument `method' (the estimation method) are:
"MLTB":
the parameters of the regression and variance functions
should be estimated by the maximum likelihood method.
"MLSTB":
the parameters of the regression and variance functions
should be estimated by the
modified least squares method.
"QLTB":
the parameters of the regression and variance functions
should be estimated by the
quasi-likelihood method.
"MLT":
the parameters of the regression function should be estimated by the
maximum likelihood method.
"WLST":
the parameters of the regression function should be estimated by the
weighted least squares method.
"OLST":
the parameters of the regression function should be estimated by the
ordinary least squares method.
"MLST":
the parameters of the regression function should be estimated by the
modified least squares method.
"QLT":
the parameters of the regression function should be estimated by the
quasi-likelihood method.
"VITWLS":
the parameters of the regression function should be estimated by the
intra-variance weighted least squares method.
"OLSB":
the parameters of the variance function should be estimated by the
ordinary least squares method.
"MLSB":
the parameters of the variance function should be estimated by the
modified least squares method.
"QLB":
the parameters of the variance function should be estimated by the
quasi-likelihood method.
"MYOWN":
the method is provided by the user (for expert statisticians only):
see `nls2.mymethod'.
In alternated estimation, the "method"
argument is required and should be a vector of length
equal to the number of steps requested (fewer than 4).
In simultaneous estimation, it is optional.
The default setting is determined
according to the values of the other inputs.
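As a sketch of the rules above, the helper below (base R syntax) checks a `method' vector before it is passed to "nls2". The function name `check.method' is hypothetical and does not belong to the nls2 library; the list of valid values is copied from this paragraph.

```r
# Valid values of the "method" argument, as listed above
valid.methods <- c("MLTB", "MLSTB", "QLTB", "MLT", "WLST", "OLST", "MLST",
                   "QLT", "VITWLS", "OLSB", "MLSB", "QLB", "MYOWN")

# Hypothetical helper: one method per step, fewer than 4 steps
check.method <- function(method) {
  stopifnot(is.character(method),
            length(method) >= 1,
            length(method) < 4,
            all(method %in% valid.methods))
  invisible(method)
}

# Alternated estimation in three steps (the sequence used in Example 2)
method <- check.method(c("OLST", "MLSB", "MLST"))
```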
When the argument `model' contains the component "file" only,
it can be reduced to a character string.
When the argument `stat.ctx' contains the component `theta.start' only,
it can be reduced to a vector or a scalar structure.
The effective values of the argument `method', the argument `stat.ctx'
and the component `vari.type' of the argument `model'
are returned in the output.
REFERENCES:
References to nls2
Other references:
- Hindmarsh, A. C. (1983). ODEPACK: A systematized collection of ODE solvers.
In Scientific Computing, R. S. Stepleman et al. (eds.),
North-Holland, Amsterdam, pp. 55-64.
- Petzold, L. R. (1983). Automatic selection of methods for solving stiff and
nonstiff systems of ordinary differential equations.
SIAM J. Sci. Stat. Comput. 4, pp. 136-148.
SEE ALSO:
- Explanations about the inputs in the help-files:
`nls2.model', `nls2.control', `nls2.integ.ctx'
- Explanations about the output in the help-file:
`nls2.object'.
- Explanations about how to write the programs that evaluate
the model manually in the help-file:
`nls2.mymodel'.
- To load the necessary programs into the S-session:
`loadnls2'.
- To examine the output:
`print.nls2' (you can just use "print"), and
`summary.nls2' (or just "summary").
- To compare:
`all.equal.nls2' (or just "all.equal")
- To plot:
`plvar.nls2', `plres.nls2', `plfit.nls2' and `plit.nls2'
(you can just use "plvar", "plres", "plfit", "plit").
The data can be plotted by using function `pldnls2'.
- To extract components:
`fitted.nls2', `coef.nls2', `residuals.nls2' and `summary.nls2'
(or just "fitted", "coef", "residuals" and "summary")
- To evaluate the model on given input values: `calcmodnls2'.
- To process successive estimations:
`bootstrap.nls2', `renls2' and `rexnls2'
(if installed on the current site).
- To calculate confidence intervals, regions and ellipsoids:
`confidence.nls2' (or just "confidence"),
`ellips.nls2' ("ellips") and `iso.nls2' ("iso")
(if installed on the current site).
- To study functions of the estimated parameters:
`calcpsinls2', `wald.nls2' ("wald"), `confidence.nls2' ("confidence")
and `ellips.nls2' ("ellips") (if installed on the current site)
- To calculate the inverse of the regression function and
make calibration:
`calcinvnls2', `calib.nls2', `plot.calibnls2'
(if installed on the current site).
EXAMPLES:
Example 1: Weibull model.
The model is described in the following file,
called "weibull":
resp y;
var v;
varind x;
parresp p1,p2,p3,p4;
subroutine;
begin
  y = p1 - p2 * exp( - p3 * exp(log(x)*p4));
  v = 1;
end
The S commands are:
# create data:
x <- c(9, 14, 21, 28, 42, 57, 63, 70, 79)
y <- c(8.93, 10.80, 18.59, 22.33, 39.35, 56.11,
61.73, 64.62, 67.08)
data <-data.frame(x,y)
# load the programs required by nls2:
loadnls2()
# execute the estimation process:
nls2.out<- nls2(data,"weibull",
c(70,62,0.0001,2.5))
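Before running the estimation, it can help to check that the starting values give a plausible curve. The sketch below, in base R syntax, re-expresses the regression function of the "weibull" file as an ordinary function and evaluates it at the starting values of Example 1. The names `weibull.f', `fit0' and `rss0' are hypothetical and are not part of nls2.

```r
# Data of Example 1
x <- c(9, 14, 21, 28, 42, 57, 63, 70, 79)
y <- c(8.93, 10.80, 18.59, 22.33, 39.35, 56.11,
       61.73, 64.62, 67.08)

# Same expression as in the model description-file:
# y = p1 - p2 * exp(-p3 * exp(log(x) * p4)), i.e. p1 - p2 * exp(-p3 * x^p4)
weibull.f <- function(x, p) {
  p[1] - p[2] * exp(-p[3] * exp(log(x) * p[4]))
}

# Starting values passed to nls2 in Example 1
start <- c(70, 62, 0.0001, 2.5)

fit0 <- weibull.f(x, start)    # regression function at the starting values
rss0 <- sum((y - fit0)^2)      # residual sum of squares at the starting values

print(round(cbind(x, y, fit0), 2))
```

If the values in `fit0' were far from `y' or not finite, different starting values would be worth trying before calling "nls2".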
Example 2: the parameters of the regression and variance functions
are estimated alternately.
The model is described in the following file, called "cortisol":
resp f;
var v;
varind x;
parresp n,d,a,b,g;
parvar h;
aux a1;
pbisresp minf,pinf;
subroutine;
begin
  a1 = 1+exp(a+b*x);
  f = if x==minf then
        d
      else
        if x==pinf then
          n
        else
          n+(d-n)*exp(-g*log(a1))
        fi
      fi;
  v = exp(h*log(f));
end
The S commands are:
# create data:
x <- matrix(c(
-5.0,-5.0,-5.0,-5.0,-5.0,-5.0,-5.0,-5.0,
-1.699,-1.699,-1.699,-1.699,-1.398,-1.398,-1.398,-1.398,
-1.222,-1.222,-1.222,-1.222,-1.097,-1.097,-1.097,-1.097,
-1.000,-1.000,-1.000,-1.000,-0.699,-0.699,-0.699,-0.699,
-0.398,-0.398,-0.398,-0.398,-0.222,-0.222,-0.222,-0.222,
-0.097,-0.097,-0.097,-0.097,0.000,0.000,0.000,0.000,
0.176,0.176,0.176,0.176,0.301,0.301,0.301,0.301,
0.602,0.602,0.602,0.602,5.000,5.000,5.000,5.000),
ncol=1,byrow=T)
f <- c(
2868,2785,2849,2805,2779,2588,2701,2752,
2615,2651,2506,2498,2474,2573,2378,2494,
2152,2307,2101,2216,2114,2052,2016,2030,
1862,1935,1800,1871,1364,1412,1377,1304,
910,919,855,875,702,701,689,696,
586,596,561,562,501,495,478,493,
392,358,399,394,330,351,343,333,
250,261,244,242,131,135,134, 133)
data <-data.frame(x,f)
# create model:
model <- list(file="cortisol", gamf=c(-5,5))
# create context:
ctx<- list(theta.start=c(130, 3000, 0,1,1),
beta.start=c(2.3), max.iters=100)
# create control:
# The intermediate results kept by default are saved
# every 5 iterations for each of the steps. Therefore,
# the intermediate results printed by default will
# appear automatically with the same frequency.
crole<- nls2.control(freq=5, step.iters.sv=c(1,2,3))
# load the programs required by nls2:
loadnls2()
# execute the estimation process:
nls2.out <- nls2(data, model, ctx,
method=c("OLST","MLSB","MLST"), control = crole)
Example 3: a model described by a system of ODEs.
The model is described in the following file, called "volterra":
resp f;
varind x, ind;
parresp H0, P0, r1,s1,r2,s2;
varint t;
valint x;
condinit H0, P0;
F H, P;
dF dH, dP;
aux a;
subroutine;
begin
  dH = (r1-s1*P)*H;
  dP = (r2-s2*(P/H))*P;
  a = if ind==1 then H[x] else
        P[x]
      fi;
  f = a;
end
The S commands are:
# create data (there are 2 independent variables:
# "x" and "ind"):
matx<-matrix(
c(0, 1, 1,1, 2,1, 3,1, 4,1, 5,1, 6,1, 7,1, 8,1, 9,1, 10,1,
11,1, 12,1, 13,1, 14,1, 15,1, 16,1, 17,1, 18,1, 19,1,
20,1, 21,1, 22,1, 23,1, 24,1, 25,1, 1,2, 15,2, 25,2),
byrow=T, ncol=2, dimnames=list(NULL,c("x","ind")))
valy<- c(39,22,24,25,27,26,24,24,23,23,23,22,26,24,27,
25,25, 25,26,24,25,23,26,26,26,20,11,10,0)
data<-data.frame(matx,f=valy)
# create context: context is reduced to the component always
# required, i.e., the starting values of the regression
# parameters.
ctx<-c(40.0, 30.0, 1.0, 0.1, 1.0, 2.5)
# create integration context:
ctxi<-list(start=0, nb.theta.odes=4)
# create control: all intermediate printings are suppressed.
control <- list(freq=0)
# load the programs required by nls2:
loadnls2()
# execute the estimation process:
nls2.out <- nls2(data,"volterra",ctx, integ.ctx=ctxi,
control=control)
Last release: Nov 17 1997