rmsMisc: Miscellaneous Design Attributes and Utility Functions

Description

These functions are used internally by anova.rms, fastbw, etc., to retrieve various attributes of a design. They also allow some fitting functions not in the rms series (e.g., lm, glm) to be used with rms.Design, fastbw, and similar functions.

For vcov, there are several functions. The method for orm fits is a bit different because the covariance matrix stored in the fit object only deals with the middle intercept. See the intercepts argument for more options. There is a method for lrm that also allows non-default intercept(s) to be selected (default is first).

The oos.loglik function for each type of model implemented computes the -2 log likelihood for out-of-sample data (i.e., data not necessarily used to fit the model) evaluated at the parameter estimates from a model fit. Vectors for the model's linear predictors and response variable must be given. oos.loglik is used primarily by bootcov.
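To make the idea of an out-of-sample -2 log likelihood concrete, here is a minimal base R sketch for a binary logistic model. It is not the rms implementation: the helper name neg2_loglik_binary, the simulated data, and the train/test split are assumptions made for illustration only.

```r
# Hypothetical helper (not part of rms): -2 log likelihood of a binary
# logistic model given linear predictors lp and 0/1 responses y.
neg2_loglik_binary <- function(lp, y) {
  -2 * sum(y * lp - log1p(exp(lp)))
}

set.seed(1)
n <- 200
x <- rnorm(n)
y <- rbinom(n, 1, plogis(0.5 + x))
train <- 1:100
test  <- 101:200

fit <- glm(y ~ x, family = binomial, subset = train)

# In-sample: for 0/1 responses this equals the residual deviance of the fit
lp_train  <- predict(fit, type = "link")
in_sample <- neg2_loglik_binary(lp_train, y[train])

# Out-of-sample: same coefficient estimates, evaluated on data not used to fit
lp_test    <- predict(fit, newdata = data.frame(x = x[test]), type = "link")
out_sample <- neg2_loglik_binary(lp_test, y[test])
```

The only inputs to the helper are a vector of linear predictors and a response vector, mirroring the lp and y arguments described for oos.loglik.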

The Getlim function retrieves distribution summaries from the fit or from a datadist object. It handles getting summaries from both sources to fill in characteristics for variables that were not defined during the model fit. Getlimi returns the summary for an individual model variable.

The related.predictors function returns a list containing variable numbers that are directly or indirectly related to each predictor. The interactions.containing function returns indexes of interaction effects containing a given predictor. The param.order function returns a vector of logical indicators for whether parameters are associated with certain types of effects (nonlinear, interaction, nonlinear interaction). combineRelatedPredictors creates a list of inter-connected main effects and interactions for use with predictrms with type='ccterms' (useful for gIndex).

The Penalty.matrix function builds a default penalty matrix for non-intercept term(s) for use in penalized maximum likelihood estimation. The Penalty.setup function takes a constant or list describing penalty factors for each type of term in the model and generates the proper vector of penalty multipliers for the current model.

logLik.rms returns the maximized log likelihood for the model, whereas AIC.rms returns the AIC. The latter function has an optional argument for computing AIC on a "chi-square" scale (model likelihood ratio chi-square minus twice the regression degrees of freedom). logLik.ols handles the case for ols, just by invoking logLik.lm in the stats package. logLik.Gls is also defined.
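The two AIC scales can be compared with a small base R sketch. It uses glm rather than an rms fit, so the simulated data and the variable names (aic_loglik, aic_chisq) are assumptions for illustration, not the AIC.rms code.

```r
set.seed(2)
n  <- 300
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- rbinom(n, 1, plogis(-0.3 + 0.8 * x1))
fit <- glm(y ~ x1 + x2, family = binomial)

k  <- 2
ll <- as.numeric(logLik(fit))

# Log-likelihood scale: -2 log L + k * (number of parameters, intercept included)
aic_loglik <- -2 * ll + k * length(coef(fit))

# "Chi-square" scale: model LR chi-square minus k * regression d.f. (slopes only)
lr_chisq  <- fit$null.deviance - fit$deviance
reg_df    <- length(coef(fit)) - 1
aic_chisq <- lr_chisq - k * reg_df
```

In this sketch the two quantities differ only by a constant and a sign change, so larger values are better on the chi-square scale while smaller values are better on the log-likelihood scale.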

nobs.rms returns the number of observations used in the fit.

The lrtest function does likelihood ratio tests for two nested models, from fits that have stats components with "Model L.R." values. For models such as psm, survreg, ols, lm which have scale parameters, it is assumed that the scale parameter for the smaller model is fixed at the estimate from the larger model (see the example).
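The arithmetic behind a likelihood ratio test of two nested models can be sketched in base R as follows. This uses glm fits and simulated data as assumptions; it illustrates the statistic only and does not reproduce how lrtest extracts "Model L.R." values or handles scale parameters.

```r
set.seed(3)
n  <- 250
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- rbinom(n, 1, plogis(0.2 + 0.7 * x1 + 0.4 * x2))

full    <- glm(y ~ x1 + x2, family = binomial)
reduced <- glm(y ~ x1,      family = binomial)

# LR statistic: twice the difference in maximized log likelihoods
lr_stat <- 2 * (as.numeric(logLik(full)) - as.numeric(logLik(reduced)))

# Degrees of freedom: difference in the number of estimated coefficients
lr_df <- length(coef(full)) - length(coef(reduced))

p_value <- pchisq(lr_stat, df = lr_df, lower.tail = FALSE)
```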

univarLR takes a multivariable model fit object from rms and re-fits a sequence of models containing one predictor at a time. It prints a table of likelihood ratio $\chi^2$ statistics from these fits.

The Newlabels function is used to override the variable labels in a fit object. Likewise, Newlevels can be used to create a new fit object with levels of categorical predictors changed. These two functions are especially useful when constructing nomograms.

rmsArgs handles ... arguments to functions such as Predict, summary.rms, nomogram so that variables to vary may be specified without values (after an equals sign).

prModFit is the workhorse for the print methods for highest-level rms model fitting functions, handling both regular and LaTeX printing, the latter resulting in LaTeX code written to the terminal, automatically ready for Sweave. The work of printing summary statistics is done by prStats, which uses the Hmisc print.char.matrix function to print overall model statistics if latex=FALSE, otherwise it generates customized LaTeX code. The LaTeX longtable and epic packages must be in effect to use these LaTeX functions.

reVector allows one to rename a subset of a named vector, ignoring the previous names and not concatenating them as R does. It also removes (by default) elements that are NA, as when an optional named element is fetched that doesn't exist.
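The name concatenation that reVector avoids is easy to reproduce in base R. The function re_vector below is a hypothetical stand-in written for this illustration (numeric elements only); it is not the rms source.

```r
x <- c(alpha = 1, beta = 2)

# Base R pastes the new name onto the old one
names(c(a = x["alpha"]))   # "a.alpha"

# Hypothetical re-implementation: name each element by its argument name,
# discard the previous names, and optionally drop NA elements.
re_vector <- function(..., na.rm = TRUE) {
  args <- list(...)
  out  <- vapply(args, function(v) as.numeric(v)[1], numeric(1))
  if (na.rm) out[!is.na(out)] else out
}

# x["gamma"] does not exist, so it is NA and is removed by default
r      <- re_vector(a = x["alpha"], g = x["gamma"])
r_keep <- re_vector(a = x["alpha"], g = x["gamma"], na.rm = FALSE)
```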

formatNP is a function to format a vector of numerics. If digits is specified, formatNP will make sure that the formatted representation has digits positions to the right of the decimal place. If latex=TRUE it will translate any scientific notation to LaTeX math form. If pvalue=TRUE, values smaller than 10 to the minus digits are replaced with a "less than" string, e.g., "< 0.0001" if digits=4.
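The P-value convention described above can be sketched in a few lines of base R. The helper format_p is hypothetical and covers only the fixed-digits and pvalue behavior, not the LaTeX translation.

```r
# Hypothetical helper, not the rms implementation: fixed number of decimals,
# with values below 10^(-digits) shown as "< threshold".
format_p <- function(x, digits = 4) {
  out       <- formatC(x, format = "f", digits = digits)
  threshold <- 10^(-digits)
  small     <- which(x < threshold)
  out[small] <- paste0("< ", formatC(threshold, format = "f", digits = digits))
  out
}

p <- format_p(c(0.00001234, 0.0312, 0.5))
# p is c("< 0.0001", "0.0312", "0.5000")
```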

latex.naprint.delete will, if appropriate, use LaTeX to draw a dot chart of frequency of variable NAs related to model fits.

removeFormulaTerms removes one or more terms from a model formula, using strictly character manipulation. This handles problems such as [.terms removing offset() if you subset on anything. The function can also be used to remove the dependent variable(s) from the formula.

Usage

"vcov"(object, regcoef.only=TRUE, intercepts='all', ...)
"vcov"(object, regcoef.only=TRUE, ...)
"vcov"(object, regcoef.only=TRUE, intercepts='all', ...)
"vcov"(object, intercepts='all', ...)
"vcov"(object, regcoef.only=TRUE, intercepts='all', ...)
"vcov"(object, regcoef.only=TRUE, ...)
"vcov"(object, regcoef.only=TRUE, intercepts='mid', ...)
"vcov"(object, regcoef.only=TRUE, ...)
oos.loglik(fit, ...)
"oos.loglik"(fit, lp, y, ...)
"oos.loglik"(fit, lp, y, ...)
"oos.loglik"(fit, lp, y, ...)
"oos.loglik"(fit, lp, y, ...)
"oos.loglik"(fit, lp, y, ...)
Getlim(at, allow.null=FALSE, need.all=TRUE)
Getlimi(name, Limval, need.all=TRUE)
related.predictors(at, type=c("all","direct"))
interactions.containing(at, pred)
combineRelatedPredictors(at)
param.order(at, term.order)
Penalty.matrix(at, X)
Penalty.setup(at, penalty)
"logLik"(object, ...)
"logLik"(object, ...)
"logLik"(object, ...)
"AIC"(object, ..., k=2, type=c('loglik', 'chisq'))
"nobs"(object, ...)
lrtest(fit1, fit2)
"print"(x, ...)
univarLR(fit)
Newlabels(fit, ...)
Newlevels(fit, ...)
"Newlabels"(fit, labels, ...)
"Newlevels"(fit, levels, ...)
prModFit(x, title, w, digits=4, coefs=TRUE, latex=FALSE, rmarkdown=FALSE, lines.page=40, long=TRUE, needspace, ...)
prStats(labels, w, latex=FALSE, file="", append=TRUE)
reVector(..., na.rm=TRUE)
formatNP(x, digits=NULL, pvalue=FALSE, latex=FALSE)
"latex"(object, file="", append=TRUE, ...)
removeFormulaTerms(form, which=NULL, delete.response=FALSE)

Arguments

fit

result of a fitting function

object

result of a fitting function

regcoef.only

For fits such as parametric survival models which have a final row and column of the covariance matrix for a non-regression parameter such as a log(scale) parameter, setting regcoef.only=TRUE causes only the first p rows and columns of the covariance matrix to be returned, where p is the length of object$coef.

intercepts

set to "none" to omit any rows and columns related to intercepts. Set to an integer scalar or vector to include particular intercept elements. Set to 'all' to include all intercepts, or for orm to "mid" to use the default for orm. The default is to use the first for lrm and the median intercept for orm.

at

Design element of a fit

pred

index of a predictor variable (main effect)

fit1

fit2

fit objects from lrm,ols,psm,cph etc. It doesn't matter which fit object is the sub-model.

lp

linear predictor vector for oos.loglik. For proportional odds ordinal logistic models, this should have used the first intercept only. If lp and y are omitted, the -2 log likelihood for the original fit is returned.

y

values of a new vector of responses passed to oos.loglik.

name

the name of a variable in the model

Limval

an object returned by Getlim

allow.null

prevents Getlim from issuing an error message if no limits are found in the fit or in the object pointed to by options(datadist=)

need.all

set to FALSE to prevent Getlim or Getlimi from issuing an error message if data for a variable are not found

type

For related.predictors, set to "direct" to return lists of indexes of directly related factors only (those in interactions with the predictor). For AIC.rms, type specifies the basis on which to return AIC. The default is minus twice the maximized log likelihood plus k times the degrees of freedom counting intercept(s). Specify type='chisq' to get a penalized model likelihood ratio chi-square instead.

term.order

1 for all parameters, 2 for all parameters associated with either nonlinear or interaction effects, 3 for nonlinear effects (main or interaction), 4 for interaction effects, 5 for nonlinear interaction effects.

X

a design matrix, not including columns for intercepts

penalty

a vector or list specifying penalty multipliers for types of model terms

k

the multiplier of the degrees of freedom to be used in computing AIC. The default is 2.

x

a result of lrtest, or the result of a high-level model fitting function (for prModFit)

labels

a character vector specifying new labels for variables in a fit. To give new labels for all variables, you can specify labels of the form labels=c("Age in Years","Cholesterol"), where the list of new labels is assumed to be the length of all main effect-type variables in the fit and in their original order in the model formula. You may specify a named vector to give new labels in any order or for a subset of the variables, e.g., labels=c(age="Age in Years",chol="Cholesterol"). For prStats, labels is a list with major column headings, which can themselves be vectors that are then stacked vertically.

levels

a list of named vectors specifying new level labels for categorical predictors. This will override parms as well as datadist information (if available) that were stored with the fit.

title

a single character string used to specify an overall title for the regression fit, which is printed first by prModFit. Set to "" to suppress the title

w

For prModFit, a special list of lists, with each list element specifying information about a block of information to include in the print output for a fit. For prStats, w is a list of statistics to print, elements of which can be vectors that are stacked vertically. Unnamed elements specify number of digits to the right of the decimal place to which to round (NA means use format without rounding, as with integers and floating point values). Negative values of digits indicate that the value is a P-value to be formatted with formatNP. Digits are recycled as needed.

digits

number of digits to the right of the decimal point, for formatting numeric values in printed output

coefs

specify coefs=FALSE to suppress printing the table of model coefficients, standard errors, etc. Specify coefs=n to print only the first n regression coefficients in the model.

latex

a logical value indicating whether information should be formatted as plain text or as LaTeX markup

file

name of file to which to write model output from print() using prStats. Default is the console.

append

specify append=FALSE when using file and you want to start over instead of adding to an existing file.

rmarkdown

set to TRUE to force latex=TRUE and to convert LaTeX code to html using Hmisc html.latex for use with RMarkdown, knitr, and RStudio

lines.page

see latex

long

set to FALSE to suppress printing of formula and certain other model output

needspace

optional character string to insert inside a LaTeX needspace macro call before the statistics table and before the coefficient matrix, to avoid bad page splits. This assumes the LaTeX needspace style is available. Example: needspace='6\baselineskip' or needspace='1.5in'.

na.rm

set to FALSE to keep NAs in the vector created by reVector

pvalue

set to TRUE if you want values below 10 to the minus digits to be shown as less than that threshold (e.g., "< 0.0001" for digits=4)

form

a formula object

which

a vector of one or more character strings specifying the names of functions that are called from a formula, e.g., "cluster". By default no right-hand-side terms are removed.

delete.response

set to TRUE to remove the dependent variable(s) from the formula

...

other arguments. For reVector this contains the elements being extracted. For prModFit this information is passed to the Hmisc latexTabular function when a block of output is a vector to be formatted in LaTeX.
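The regcoef.only argument concerns fits whose covariance matrix has an extra final row and column for a non-regression parameter. The sketch below shows that structure with survreg from the survival package (a recommended package normally distributed with R) and its lung data set, which are assumptions chosen for illustration. Subsetting to the first p rows and columns is written out by hand; it is not a call into the rms vcov methods.

```r
library(survival)   # assumed to be installed; ships with standard R distributions

fit <- survreg(Surv(time, status) ~ age, data = lung, dist = "weibull")

V <- vcov(fit)                      # has a final row and column for Log(scale)
p <- length(coef(fit))              # number of regression coefficients

# Keep only the first p rows and columns, i.e., the regression coefficients
V_reg <- V[seq_len(p), seq_len(p)]
```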

Value

vcov returns a variance-covariance matrix. oos.loglik returns a scalar -2 log likelihood value.

Getlim returns a list with components limits and values, either stored in fit or retrieved from the object created by datadist and pointed to in options(datadist=).

related.predictors and combineRelatedPredictors return a list of vectors, and interactions.containing returns a vector. param.order returns a logical vector corresponding to non-strata terms in the model.

Penalty.matrix returns a symmetric matrix with dimension equal to the number of slopes in the model. For all but categorical predictor main effect elements, the matrix is diagonal with values equal to the variances of the columns of X. For segments corresponding to c-1 dummy variables for c-category predictors, Penalty.matrix contains a (c-1) x (c-1) sub-matrix that is constructed so that a quadratic form with Penalty.matrix in the middle computes the sum of squared differences in parameter values about the mean, including a portion for the reference cell in which the parameter is by definition zero.

Newlabels returns a new fit object with the labels adjusted. reVector returns a vector of named (by its arguments) elements. formatNP returns a character vector. removeFormulaTerms returns a formula object.
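The property stated for the categorical-predictor block of Penalty.matrix can be checked numerically. The sketch below derives a sub-matrix from that description alone (identity minus 1/c in every cell); the actual rms code may construct or scale it differently, and the coefficient values are made up.

```r
c_levels <- 4                  # number of categories (assumed)
b <- c(0.5, -1.2, 2.0)         # c - 1 dummy-variable coefficients (made up)

# Sub-matrix implied by the description: I - (1/c) * J, of size (c-1) x (c-1)
P <- diag(c_levels - 1) - 1 / c_levels

# Direct computation, including the reference cell whose parameter is zero
all_params <- c(0, b)
direct <- sum((all_params - mean(all_params))^2)

# Quadratic form with P in the middle
quad <- as.numeric(t(b) %*% P %*% b)
```

Both direct and quad give the sum of squared differences of the c parameters about their mean, which is the quantity the description says the quadratic form computes.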

Examples


## Not run: 
# f <- psm(S ~ x1 + x2 + sex + race, dist='gau')
# g <- psm(S ~ x1 + sex + race, dist='gau', 
#          fixed=list(scale=exp(f$parms)))
# lrtest(f, g)
# 
# 
# g <- Newlabels(f, c(x2='Label for x2'))
# g <- Newlevels(g, list(sex=c('Male','Female'),race=c('B','W')))
# nomogram(g)
# ## End(Not run)
