gev {VGAM} | R Documentation |
Maximum likelihood estimation of the 3-parameter generalized extreme value (GEV) distribution.
gev(llocation = "identity", lscale = "loge", lshape = "logoff", elocation = list(), escale = list(), eshape = if(lshape=="logoff") list(offset=0.5) else if(lshape=="elogit") list(min=-0.5, max=0.5) else list(), percentiles = c(95, 99), iscale=NULL, ishape = NULL, method.init = 1, gshape=c(-0.45, 0.45), tolshape0=0.001, giveWarning=TRUE, zero = 3) egev(llocation = "identity", lscale = "loge", lshape = "logoff", elocation = list(), escale = list(), eshape = if(lshape=="logoff") list(offset=0.5) else if(lshape=="elogit") list(min=-0.5, max=0.5) else list(), percentiles = c(95, 99), iscale=NULL, ishape = NULL, method.init=1, gshape=c(-0.45, 0.45), tolshape0=0.001, giveWarning=TRUE, zero = 3)
llocation, lscale, lshape |
Parameter link functions for mu, sigma and
xi respectively.
See Links for more choices.
|
elocation, escale, eshape |
List. Extra argument for the respective links.
See earg in Links for general information.
For the shape parameter,
if the logoff link is chosen then the offset is
called A below; and then the linear/additive predictor is
log(xi+A) which means that
xi > -A.
For technical reasons (see Details) it is a good idea for A=0.5.
|
percentiles |
Numeric vector of percentiles used
for the fitted values. Values should be between 0 and 100.
However, if percentiles=NULL , then the mean
mu + sigma * (gamma(1-xi)-1)/xi
is returned, and this is only defined if xi<1.
|
iscale, ishape |
Numeric. Initial value for sigma and
xi. A NULL means a value is computed internally.
The argument ishape is more important than the other two because
they are initialized from the initial xi.
If a failure to converge occurs, or even to obtain initial values occurs,
try assigning ishape some value
(positive or negative; the sign can be very important).
Also, in general, a larger value of iscale is better than a
smaller value.
|
method.init |
Initialization method. Either the value 1 or 2.
Method 1 involves choosing the best xi on a course grid with
endpoints gshape .
Method 2 is similar to the method of moments.
If both methods fail try using ishape .
|
gshape |
Numeric, of length 2.
Range of xi used for a grid search for a good initial value
for xi.
Used only if method.init equals 1.
|
tolshape0, giveWarning |
Passed into dgev when computing the log-likelihood.
|
zero |
An integer-valued vector specifying which
linear/additive predictors are modelled as intercepts only.
The values must be from the set {1,2,3} corresponding
respectively to mu, sigma, xi.
If zero=NULL then all linear/additive predictors are modelled as
a linear combination of the explanatory variables.
For many data sets having zero=3 is a good idea.
|
The GEV distribution function can be written
G(y) = exp( -[ (y- mu)/ sigma ]_{+}^{- 1/ xi})
where sigma > 0, -Inf < mu < Inf, and 1 + xi*(y-mu)/sigma > 0. Here, x_+ = max(x,0). The mu, sigma, xi are known as the location, scale and shape parameters respectively. The cases xi>0, xi<0, xi=0 correspond to the Frechet, Weibull, and Gumbel types respectively. It can be noted that the Gumbel (or Type I) distribution accommodates many commonly-used distributions such as the normal, lognormal, logistic, gamma, exponential and Weibull.
For the GEV distribution, the kth moment about the mean exists if xi < 1/k. Provided they exist, the mean and variance are given by mu + sigma { Gamma(1-xi)-1} / xi and sigma^2 { Gamma(1-2 xi) - Gamma^2 (1- xi) } / xi^2 respectively, where Gamma is the gamma function.
Smith (1985) established that when xi > -0.5,
the maximum likelihood estimators are completely regular.
To have some control over the estimated xi try
using lshape="logoff"
and the eshape=list(offset=0.5)
, say,
or lshape="elogit"
and eshape=list(min=-0.5, max=0.5)
, say.
An object of class "vglmff"
(see vglmff-class
).
The object is used by modelling functions such as vglm
,
and vgam
.
Currently, if an estimate of xi is too close to zero then
an error will occur for gev()
with multivariate responses.
In general, egev()
is more reliable than gev()
.
Fitting the GEV by maximum likelihood estimation can be numerically
fraught. If 1 + xi*(y-mu)/sigma <=
0 then some crude evasive action is taken but the estimation process
can still fail. This is particularly the case if vgam
with s
is used; then smoothing is best done with
vglm
with regression splines (bs
or ns
) because vglm
implements
half-stepsizing whereas vgam
doesn't (half-stepsizing
helps handle the problem of straying outside the parameter space.)
The VGAM family function gev
can handle a multivariate
(matrix) response. If so, each row of the matrix is sorted into
descending order and NA
s are put last.
With a vector or one-column matrix response using
egev
will give the same result but be faster and it handles
the xi=0 case.
The function gev
implements Tawn (1988) while
egev
implements Prescott and Walden (1980).
The shape parameter xi is difficult to estimate
accurately unless there is a lot of data.
Convergence is slow when xi is near -0.5.
Given many explanatory variables, it is often a good idea
to make sure zero=3
.
The range restrictions of the parameter xi are not
enforced; thus it is possible for a violation to occur.
Successful convergence often depends on having a reasonably good initial
value for xi. If failure occurs try various values for the
argument ishape
, and if there are covariates,
having zero=3
is advised.
T. W. Yee
Yee, T. W. and Stephenson, A. G. (2007) Vector generalized linear and additive extreme value models. Extremes, 10, 1–19.
Tawn, J. A. (1988) An extreme-value theory model for dependent observations. Journal of Hydrology, 101, 227–250.
Prescott, P. and Walden, A. T. (1980) Maximum likelihood estimation of the parameters of the generalized extreme-value distribution. Biometrika, 67, 723–724.
Smith, R. L. (1985) Maximum likelihood estimation in a class of nonregular cases. Biometrika, 72, 67–90.
rgev
,
gumbel
,
egumbel
,
guplot
,
rlplot.egev
,
gpd
,
elogit
,
oxtemp
,
venice
.
# Multivariate example y = as.matrix(venice[,paste("r", 1:10, sep="")]) fit1 = vgam(y[,1:2] ~ s(year, df=3), gev(zero=2:3), venice, trace=TRUE) coef(fit1, matrix=TRUE) head(fitted(fit1)) ## Not run: par(mfrow=c(1,2), las=1) plot(fit1, se=TRUE, lcol="blue", scol="forestgreen", main="Fitted mu(year) function (centered)", cex.main=0.8) with(venice, matplot(year, y[,1:2], ylab="Sea level (cm)", col=1:2, main="Highest 2 annual sea levels", cex.main=0.8)) with(venice, lines(year, fitted(fit1)[,1], lty="dashed", col="blue")) legend("topleft", lty="dashed", col="blue", "Fitted 95 percentile") ## End(Not run) # Univariate example (fit = vglm(maxtemp ~ 1, egev, data=oxtemp, trace=TRUE)) head(fitted(fit)) coef(fit, mat=TRUE) Coef(fit) vcov(fit) vcov(fit, untransform=TRUE) sqrt(diag(vcov(fit))) # Approximate standard errors ## Not run: rlplot(fit)