Usage
gofGroupTest(object, ...)

## S3 method for class 'formula'
gofGroupTest(object, data = NULL, subset, na.action = na.pass, ...)

## Default S3 method:
gofGroupTest(object, group, test = "sw", distribution = "norm",
    est.arg.list = NULL, n.classes = NULL, cut.points = NULL,
    param.list = NULL,
    estimate.params = ifelse(is.null(param.list), TRUE, FALSE),
    n.param.est = NULL, correct = NULL, digits = .Options$digits,
    exact = NULL, ws.method = "normal scores", data.name = NULL,
    group.name = NULL, parent.of.data = NULL, subset.expression = NULL, ...)

## S3 method for class 'data.frame'
gofGroupTest(object, ...)

## S3 method for class 'matrix'
gofGroupTest(object, ...)

## S3 method for class 'list'
gofGroupTest(object, ...)
Arguments
object
an object containing data for 2 or more groups to be compared to the
hypothesized distribution specified by distribution. In the default method,
the argument object must be a numeric vector. When object is a data frame,
all columns must be numeric. When object is a matrix, it must be a numeric
matrix. When object is a list, all components must be numeric vectors.
In the formula method, a symbolic specification of the form y ~ g can be
given, indicating the observations in the vector y are to be grouped
according to the levels of the factor g. Missing (NA), undefined (NaN),
and infinite (Inf, -Inf) values are allowed but will be removed.
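The input shapes described above can be sketched as follows. The data are simulated and the names y, g, long_df, as_list, wide_df and wide_mat are hypothetical. The calls to gofGroupTest run only if the EnvStats package is installed, and they show just the default and formula methods.

```r
# Simulated data: 3 hypothetical groups of 10 observations each.
set.seed(42)
y <- rnorm(30, mean = 10, sd = 2)
g <- factor(rep(c("A", "B", "C"), each = 10))

# The same observations in the other accepted shapes.
long_df  <- data.frame(y = y, g = g)   # for the formula method (y ~ g)
as_list  <- split(y, g)                # list: one numeric vector per group
wide_df  <- as.data.frame(as_list)     # data frame: one numeric column per group
wide_mat <- as.matrix(wide_df)         # numeric matrix: one column per group

# Run the group test only when EnvStats is available.
if (requireNamespace("EnvStats", quietly = TRUE)) {
  library(EnvStats)
  print(gofGroupTest(y, group = g))            # default method
  print(gofGroupTest(y ~ g, data = long_df))   # formula method
}
```

In the wide shapes, the grouping is carried by the columns or list components, so no group argument is needed.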
data
when object is a formula, data specifies an optional data frame, list or
environment (or object coercible by as.data.frame to a data frame)
containing the variables in the model. If not found in data, the variables
are taken from environment(formula), typically the environment from which
gofGroupTest is called.
subset
when object is a formula, subset is an optional vector specifying
a subset of observations to be used.
na.action
when object is a formula, na.action specifies a function which indicates
what should happen when the data contain NAs. The default is na.pass.
group
when object is a numeric vector, group is a factor or character vector
indicating which group each observation belongs to. When object is a matrix
or data frame, this argument is ignored and the columns define the groups.
When object is a list, this argument is ignored and the components define
the groups. When object is a formula, this argument is ignored and the
right-hand side of the formula specifies the grouping variable.
test
character string defining which goodness-of-fit test to perform on each group.
Possible values are:
"sw" (Shapiro-Wilk; the default),
"sf" (Shapiro-Francia),
"ppcc" (Probability Plot Correlation Coefficient),
"skew" (Zero-skew),
"chisq" (Chi-squared),
"ks" (Kolmogorov-Smirnov), and
"ws" (Wilk-Shapiro test for Uniform [0, 1] distribution).
distribution
a character string denoting the distribution abbreviation. See the help file
for Distribution.df for a list of distributions and their abbreviations.
The default value is distribution="norm" (Normal distribution).
When test="sw", test="sf", or test="ppcc", any continuous distribution is
allowed (e.g., "norm" (normal), "lnorm" (lognormal), "gamma" (gamma), etc.),
as well as mixed distributions involving the normal distribution
(i.e., "zmnorm" (zero-modified normal), "zmlnorm" (zero-modified lognormal
(delta)), and "zmlnorm.alt" (zero-modified lognormal with alternative
parameterization)).
When test="skew", only the values "norm" (normal), "lnorm" (lognormal),
"lnorm.alt" (lognormal with alternative parameterization),
"zmnorm" (zero-modified normal), "zmlnorm" (zero-modified lognormal (delta)),
and "zmlnorm.alt" (zero-modified lognormal with alternative parameterization)
are allowed.
When test="ks", any continuous distribution is allowed.
When test="chisq", any distribution is allowed.
When test="ws", this argument is ignored.
est.arg.list
a list of arguments to be passed to the function estimating the distribution
parameters for each group of observations.
For example, if test="sw" and distribution="gamma", setting
est.arg.list=list(method="bcmle") indicates using the bias-corrected
maximum-likelihood estimators of shape and scale (see the help file for
egamma). See the help file Estimating Distribution Parameters for a list of
estimating functions. The default value is est.arg.list=NULL so that all
default values for the estimating function are used. This argument is ignored
if estimate.params=FALSE.
When test="sw", test="sf", test="ppcc", or test="skew", and you are testing
for some form of normality (i.e., Normal, Lognormal, Three-Parameter
Lognormal, Zero-Modified Normal, or Zero-Modified Lognormal (Delta)),
the estimated parameters are provided in the output merely for information,
and the choice of the method of estimation has no effect on the
goodness-of-fit test statistics or p-values.
When test="ks" or test="chisq", and estimate.params=TRUE, the estimated
parameters are used to specify the null hypothesis of which distribution
the data are assumed to come from.
When test="ws", this argument is ignored.
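One way to see why the estimation method cannot affect a normality test is that the Shapiro-Wilk statistic does not change under a shift or rescaling of the data. The sketch below uses shapiro.test from base R as a stand-in for the per-group test; it does not reproduce the internals of gofGroupTest. The helper sw_by_group and the simulated groups are hypothetical.

```r
# Two hypothetical groups with different means and spreads.
set.seed(1)
y <- c(rnorm(12, mean = 5, sd = 1), rnorm(12, mean = 8, sd = 3))
g <- factor(rep(c("well1", "well2"), each = 12))

# Shapiro-Wilk W statistic computed separately for each group.
sw_by_group <- function(values, groups) {
  sapply(split(values, groups),
         function(v) unname(shapiro.test(v)$statistic))
}

w_original <- sw_by_group(y, g)
w_rescaled <- sw_by_group(3 + 2 * y, g)   # shift and rescale every value

# W is unchanged (up to floating-point error) by location and scale.
print(all.equal(w_original, w_rescaled))
```

Because the statistic depends only on the shape of the ordered sample, no estimate of the mean or standard deviation enters it.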
n.classes
for the case when test="chisq", the number of cells into which the
observations within each group are to be allocated. If the argument
cut.points is supplied, then n.classes is set to length(cut.points)-1.
The default value is ceiling(2 * length(x)^(2/5)), where x denotes the
observations in a group, as recommended by Moore (1986).
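As a quick illustration of the default rule, the helper below (a hypothetical name, default_n_classes) evaluates the formula for a few group sizes.

```r
# Default number of chi-squared cells for a group of n observations,
# following the formula ceiling(2 * n^(2/5)).
default_n_classes <- function(n) ceiling(2 * n^(2/5))

sizes <- c(10, 50, 100)
print(data.frame(n = sizes, n.classes = default_n_classes(sizes)))
```

The number of cells grows slowly with the sample size: 6 cells for 10 observations, 10 for 50, and 13 for 100.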
cut.points
for the case when test="chisq", a vector of cutpoints that defines the cells
for each group of observations. The element x[i] is allocated to cell j if
cut.points[j] < x[i] <= cut.points[j+1]. If x[i] is less than or equal to
the first cutpoint or greater than the last cutpoint, then x[i] is treated
as missing. If the hypothesized distribution is discrete, cut.points must be
supplied. The default value is cut.points=NULL, in which case the cutpoints
are determined by n.classes equi-probable intervals.
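The allocation rule can be mimicked with the base R function cut, which uses left-open, right-closed intervals when right = TRUE. This is a sketch of the stated rule only, not of how gofGroupTest computes cells internally. The equi-probable cutpoints are shown for a standard normal distribution as an assumed example.

```r
# Allocate observations to cells (cut.points[j], cut.points[j+1]].
x <- c(0, 1, 1.5, 2, 3, 3.5)
cut.points <- c(0, 1, 2, 3)
cell <- as.integer(cut(x, breaks = cut.points, right = TRUE))
print(cell)  # NA 1 2 2 3 NA: values outside the cutpoints become missing

# Equi-probable cutpoints for an assumed N(0, 1) hypothesis with 4 cells:
# quantiles at equally spaced probabilities.
n.classes <- 4
equi_points <- qnorm(seq(0, 1, length.out = n.classes + 1))
print(equi_points)
print(diff(pnorm(equi_points)))  # each cell has probability 0.25
```

Note that the value 0 equals the first cutpoint and 3.5 exceeds the last one, so both are treated as missing.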
param.list
for the case when test="ks" or test="chisq", a list with values for the
parameters of the specified distribution. See the help file for
Distribution.df for the names and possible values of the parameters
associated with each distribution. The default value is NULL, which forces
estimation of the distribution parameters. This argument is ignored if
estimate.params=TRUE.
estimate.params
for the case when test="ks" or test="chisq", a logical scalar indicating
whether to perform the goodness-of-fit test based on estimating the
distribution parameters (estimate.params=TRUE) or using the user-supplied
distribution parameters specified by param.list (estimate.params=FALSE).
The default value of estimate.params is TRUE if param.list=NULL, otherwise
it is FALSE.
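The way the default of estimate.params follows from param.list can be sketched with a small stand-in function. The name resolve_estimate_params is hypothetical; only the default expression is taken from the usage above, and the parameter values are made up.

```r
# Stand-in that reuses the default expression from the usage section.
resolve_estimate_params <- function(
    param.list = NULL,
    estimate.params = ifelse(is.null(param.list), TRUE, FALSE)) {
  estimate.params
}

# No param.list supplied: parameters are estimated from the data.
print(resolve_estimate_params())

# param.list supplied: the user-specified values are used instead.
print(resolve_estimate_params(param.list = list(mean = 0, sd = 1)))

# An explicit estimate.params always overrides the default.
print(resolve_estimate_params(param.list = list(mean = 0, sd = 1),
                              estimate.params = TRUE))
```

In the last call param.list would be ignored, in line with the description of param.list.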
n.param.est
for the case when test="ks" or test="chisq", an integer indicating the
number of parameters estimated from the data. If estimate.params=TRUE,
the default value is the number of parameters associated with the
distribution specified by distribution (e.g., 2 for a normal distribution).
If estimate.params=FALSE, the default value is n.param.est=0.
correct
for the case when test="chisq", a logical scalar indicating whether to use
the continuity correction. The default value is correct=FALSE unless
n.classes=2.
digits
a scalar indicating how many significant digits to print out for the
parameters associated with the hypothesized distribution. The default value
is .Options$digits.
exact
for the case when test="ks", exact=NULL by default, but can be set to
a logical scalar indicating whether an exact p-value should be computed.
See the help file for ks.test for more information.
ws.method
character string indicating which method to use when performing the
Wilk-Shapiro test for a Uniform [0,1] distribution on the p-values from the
goodness-of-fit tests on each group. Possible values are
ws.method="normal scores" (the default) or ws.method="chi-square scores".
See the subsection Wilk-Shapiro goodness-of-fit test for Uniform [0, 1]
distribution under the DETAILS section of the help file for gofTest for more
information. NOTE: In the case where you are testing whether each group
comes from a Uniform [0,1] distribution (i.e., when you set test="ws"),
the argument ws.method determines which score types are used for each
individual test of the groups as well.
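The two score types can be sketched as follows, assuming the usual definitions: normal scores sum the standard normal quantiles of the p-values and divide by the square root of their count, and chi-square scores use minus twice the sum of the log p-values. The function combine_p_values is hypothetical, and the one-sided direction of each p-value is an assumption; the help file for gofTest is the authority on what the package actually computes.

```r
# Combine k per-group p-values into one statistic and p-value.
# Assumption: small group p-values count as evidence against the null.
combine_p_values <- function(p,
    ws.method = c("normal scores", "chi-square scores")) {
  ws.method <- match.arg(ws.method)
  k <- length(p)
  if (ws.method == "normal scores") {
    stat <- sum(qnorm(p)) / sqrt(k)          # N(0, 1) under the null
    p.value <- pnorm(stat)                   # lower tail
  } else {
    stat <- -2 * sum(log(p))                 # chi-square, 2k df, under the null
    p.value <- pchisq(stat, df = 2 * k, lower.tail = FALSE)
  }
  list(statistic = stat, p.value = p.value)
}

group_p <- c(0.01, 0.02, 0.03)   # made-up per-group p-values
print(combine_p_values(group_p, "normal scores"))
print(combine_p_values(group_p, "chi-square scores"))
```

Both variants treat the group p-values as a sample that should look Uniform [0,1] when every group follows the hypothesized distribution.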
data.name
character string indicating the name of the data used for the goodness-of-fit
tests. The default value is data.name=deparse(substitute(object)).
group.name
character string indicating the name of the data used to create the groups.
The default value is group.name=deparse(substitute(group)).
parent.of.data
character string indicating the source of the data used for the goodness-of-fit tests.
subset.expression
character string indicating the expression used to subset the data.
...
additional arguments affecting the goodness-of-fit test.