betabinomial {VGAM} | R Documentation |
Fits a beta-binomial distribution by maximum likelihood estimation. The two parameters here are the mean and correlation coefficient.
betabinomial(lmu="logit", lrho="logit", emu=list(), erho=list(), irho=NULL, method.init=1, shrinkage.init=0.95, nsimEIM=NULL, zero=2)
lmu, lrho |
Link functions applied to the two parameters.
See Links for more choices.
The defaults ensure the parameters remain in (0,1),
however, see the warning below.
|
emu, erho |
List. Extra argument for each of the links.
See earg in Links for general information.
|
irho |
Optional initial value for the correlation parameter.
If given, it must be in (0,1), and is recyled to the necessary
length. Assign this argument a value if a convergence failure occurs.
Having irho=NULL means an initial value is obtained internally,
though this can give unsatisfactory results.
|
method.init |
An integer with value 1 or 2 or ...,
which specifies the initialization method for mu.
If failure to converge occurs try the another value
and/or else specify a value for irho .
|
zero |
An integer specifying which
linear/additive predictor is to be modelled as an intercept only.
If assigned, the single value should be either 1 or 2 .
The default is to have a single correlation parameter.
To model both parameters as functions of the covariates assign
zero=NULL .
See CommonVGAMffArguments for more information.
|
shrinkage.init, nsimEIM |
See CommonVGAMffArguments for more information.
The argument shrinkage.init is used only if method.init=2 .
Using the argument nsimEIM may offer large advantages for large
values of N and/or large data sets.
|
There are several parameterizations of the beta-binomial distribution.
This family function directly models the mean and correlation
parameter, i.e.,
the probability of success.
The model can be written
T|P=p ~ Binomial(N,p)
where P has a beta distribution with shape parameters
alpha and beta. Here,
N is the number of trials (e.g., litter size),
T=NY is the number of successes, and
p is the probability of a success (e.g., a malformation).
That is, Y is the proportion of successes. Like
binomialff
, the fitted values are the
estimated probability
of success (i.e., E[Y] and not E[T])
and the prior weights N are attached separately on the
object in a slot.
The probability function is
P(T=t) = choose(N,t) B(alpha+t, beta+N-t) / B(alpha, beta)
where t=0,1,...,N, and B is the
beta
function
with shape parameters alpha and beta.
Recall Y = T/N is the real response being modelled.
The default model is eta1 =logit(mu) and eta2 = logit(rho) because both parameters lie between 0 and 1. The mean (of Y) is p = mu = alpha / (alpha + beta) and the variance (of Y) is mu(1-mu)(1+(N-1)rho)/N. Here, the correlation rho is given by 1/(1 + alpha + beta) and is the correlation between the N individuals within a litter. A litter effect is typically reflected by a positive value of rho. It is known as the over-dispersion parameter.
This family function uses Fisher scoring. Elements of the second-order expected derivatives with respect to alpha and beta are computed numerically, which may fail for large alpha, beta, N or else take a long time.
An object of class "vglmff"
(see vglmff-class
).
The object is used by modelling functions such as vglm
.
Suppose fit
is a fitted beta-binomial model. Then
fit@y
contains the sample proportions y,
fitted(fit)
returns estimates of E(Y), and
weights(fit, type="prior")
returns the number
of trials N.
If the estimated rho parameter is close to zero then it pays to try
lrho="rhobit"
. One day this may become the default link function.
This family function is prone to numerical difficulties
due to the expected information matrices not being positive-definite
or ill-conditioned over some regions of the parameter space.
If problems occur try setting irho
to some numerical value,
nsimEIM=100
, say,
or else use etastart
argument of
vglm
, etc.
This function processes the input in the same way
as binomialff
. But it does not handle
the case N=1 very well because there are two
parameters to estimate, not one, for each row of the input.
Cases where N=1 can be omitted via the
subset
argument of vglm
.
The extended beta-binomial distribution of Prentice (1986)
is currently not implemented in the VGAM package as it has
range-restrictions for the correlation parameter that are currently
too difficult to handle in this package.
However, try lrho="rhobit"
.
T. W. Yee
Moore, D. F. and Tsiatis, A. (1991) Robust estimation of the variance in moment methods for extra-binomial and extra-Poisson variation. Biometrics, 47, 383–401.
Prentice, R. L. (1986) Binary regression using an extended beta-binomial distribution, with discussion of correlation induced by covariate measurement errors. Journal of the American Statistical Association, 81, 321–327.
betabin.ab
,
Betabin
,
binomialff
,
betaff
,
dirmultinomial
,
lirat
.
# Example 1 N = 10; mu = 0.5; rho = 0.8 y = rbetabin(n=100, size=N, prob=mu, rho=rho) fit = vglm(cbind(y,N-y) ~ 1, betabinomial, trace=TRUE) coef(fit, matrix=TRUE) Coef(fit) head(cbind(fit@y, weights(fit, type="prior"))) # Example 2 fit = vglm(cbind(R,N-R) ~ 1, betabinomial, data=lirat, trace=TRUE, subset=N>1) coef(fit, matrix=TRUE) Coef(fit) t(fitted(fit)) t(fit@y) t(weights(fit, type="prior")) # Example 3, which is more complicated lirat = transform(lirat, fgrp = factor(grp)) summary(lirat) # Only 5 litters in group 3 fit2 = vglm(cbind(R,N-R) ~ fgrp + hb, betabinomial(zero=2), data=lirat, trace=TRUE, subset=N>1) coef(fit2, matrix=TRUE) ## Not run: with(lirat, plot(hb[N>1], fit2@misc$rho, xlab="Hemoglobin", ylab="Estimated rho", pch=as.character(grp[N>1]), col=grp[N>1])) ## End(Not run) ## Not run: # cf. Figure 3 of Moore and Tsiatis (1991) with(lirat, plot(hb, R/N, pch=as.character(grp), col=grp, las=1, xlab="Hemoglobin level", ylab="Proportion Dead", main="Fitted values (lines)")) smalldf = with(lirat, lirat[N>1,]) for(gp in 1:4) { xx = with(smalldf, hb[grp==gp]) yy = with(smalldf, fitted(fit2)[grp==gp]) o = order(xx) lines(xx[o], yy[o], col=gp) } ## End(Not run)