Functions
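As a quick orientation to the listing that follows, here is a minimal sketch calling a handful of the entries. It assumes the Hmisc package is installed and attached; the data values and variable names (vars, grp, auc, wm) are made up for illustration only.

```r
library(Hmisc)  # assumption: Hmisc is installed

# Cs: character vector from a list of unquoted names
vars <- Cs(age, sex, bp)

# all.is.numeric: check whether character strings are legal numerics
ok <- all.is.numeric(c("1", "2.5", "x"))

# cut2: like cut, here used to form four quantile groups
x <- 1:100
grp <- cut2(x, g = 4)

# label: set and fetch a label for an object
label(x) <- "Subject index"

# trap.rule: area under the curve defined by x and y vectors
auc <- trap.rule(c(0, 1, 2), c(0, 1, 2))

# wtd.mean: one of the weighted-estimate functions
wm <- wtd.mean(c(1, 2, 3), weights = c(1, 1, 2))
```

Each call uses only the leading arguments; most of these functions accept further options described on their own help pages.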
ll{
Function Name Purpose
abs.error.pred Computes various indexes of predictive accuracy based
on absolute errors, for linear models
all.is.numeric Check if character strings are legal numerics
approxExtrap Linear extrapolation
aregImpute Multiple imputation based on additive regression,
bootstrapping, and predictive mean matching
areg.boot Nonparametrically estimate transformations for both
sides of a multiple additive regression, and
bootstrap these estimates and $R^2$
ballocation Optimum sample allocations in 2-sample proportion test
binconf Exact confidence limits for a proportion and more accurate
(narrower!) score stat.-based Wilson interval
(Rollin Brant, mod. FEH)
bootkm Bootstrap Kaplan-Meier survival or quantile estimates
bpower Approximate power of 2-sided test for 2 proportions
Includes bpower.sim for exact power by simulation
bpplot Box-Percentile plot
(Jeffrey Banfield, umsfjban@bill.oscs.montana.edu)
bsamsize Sample size requirements for test of 2 proportions
bystats Statistics on a single variable by levels of >=1 factors
bystats2 2-way statistics
calltree Calling tree of functions
(David Lubinsky, david@hoqax.att.com)
character.table Shows numeric equivalents of all Latin characters
Useful for putting many special chars. in graph titles
(Pierre Joyet, pierre.joyet@bluewin.ch)
ciapower Power of Cox interaction test
cleanup.import More compactly store variables in a data frame, and clean up
problem data when e.g. Excel spreadsheet had a non-
numeric value in a numeric column
combine.levels Combine infrequent levels of a categorical variable
comment Attach a comment attribute to an object:
comment(fit) <- 'Used old data'
comment(fit) (prints comment)
confbar Draws confidence bars on an existing plot using multiple
confidence levels distinguished using color or gray scale
contents Print the contents (variables, labels, etc.) of a data frame
cpower Power of Cox 2-sample test allowing for noncompliance
Cs Vector of character strings from list of unquoted names
csv.get Enhanced importing of comma-separated files, with labels
cut2 Like cut with better endpoint label construction and allows
construction of quantile groups or groups with given n
datadensity Snapshot graph of distributions of all variables in
a data frame. For continuous variables uses scat1d.
dataRep Quantify representation of new observations in a database
ddmmmyy SAS "date7" output format for a chron object
deff Kish design effect and intra-cluster correlation
describe Function to describe different classes of objects.
Invoke by saying describe(object). It calls one of the
following:
describe.data.frame
Describe all variables in a data frame (generalization
of SAS UNIVARIATE)
describe.default
Describe a variable (generalization of SAS UNIVARIATE)
do Assists with batch analyses
dot.chart Dot chart for one or two classification variables
Dotplot Enhancement of Trellis dotplot allowing for matrix
x-var., auto generation of Key function, superposition
drawPlot Simple mouse-driven drawing program, including a function
for fitting Bezier curves
ecdf Empirical cumulative distribution function plot
eip Edit an object "in-place" (may be dangerous!), e.g.
eip(sqrt) will replace the builtin sqrt function
errbar Plot with error bars (Charles Geyer, U. Chi., mod FEH)
event.chart Plot general event charts (Jack Lee, jjlee@mdanderson.org,
Ken Hess, Joel Dubin; Am Statistician 54:63-70,2000)
event.history Event history chart with time-dependent cov. status
(Joel Dubin, joel.dubin@yale.edu)
find.matches Find matches (with tolerances) between columns of 2 matrices
first.word Find the first word in an S expression (R Heiberger)
fit.mult.impute Fit most regression models over multiple transcan imputations,
compute imputation-adjusted variances and avg. betas
format.df Format a matrix or data frame with much user control
(R Heiberger and FE Harrell)
ftupwr Power of 2-sample binomial test using Fleiss, Tytun, Ury
ftuss Sample size for 2-sample binomial test using " " " "
(Both by Dan Heitjan, dheitjan@biostats.hmc.psu.edu)
gbayes Bayesian posterior and predictive distributions when both
the prior and the likelihood are Gaussian
getHdata Fetch and list datasets on our web site
gs.slide Sets nice defaults for graph sheets for S-Plus 2000 for
copying graphs into Microsoft applications
hdquantile Harrell-Davis nonparametric quantile estimator with s.e.
histbackback Back-to-back histograms (Pat Burns, Salomon Smith
Barney, London, pburns@dorado.sbi.com)
hist.data.frame Matrix of histograms for all numeric vars. in data frame
Use hist.data.frame(data.frame.name)
histSpike Add high-resolution spike histograms or density estimates
to an existing plot
hoeffd Hoeffding's D test (omnibus test of independence of X and Y)
impute Impute missing data (generic method)
interaction More flexible version of builtin function
is.present Tests for non-blank character values or non-NA numeric values
james.stein James-Stein shrinkage estimates of cell means from raw data
labcurve Optimally label a set of curves that have been drawn on
an existing plot, on the basis of gaps between curves.
Also position legends automatically at emptiest rectangle.
label Set or fetch a label for an S-object
Lag Lag a vector, padding on the left with NA or ''
latex Convert an S object to LaTeX (R Heiberger & FE Harrell)
ldBands Lan-DeMets bands for group sequential tests
list.tree Pretty-print the structure of any data object
(Alan Zaslavsky, zaslavsk@hcp.med.harvard.edu)
mask 8-bit logical representation of a short integer value
(Rick Becker)
matchCases Match each case on one continuous variable
matxv Fast matrix * vector, handling intercept(s) and NAs
mem mem() types quick summary of memory used during session
mgp.axis Version of axis() that uses appropriate mgp from
mgp.axis.labels and gets around bug in axis(2, ...)
that causes it to assume las=1
mgp.axis.labels
Used by survplot and plot in Design library (and other
functions in the future) so that different spacing
between tick marks and axis tick mark labels may be
specified for x- and y-axes. ps.slide, win.slide,
gs.slide set up nice defaults for mgp.axis.labels.
Otherwise use mgp.axis.labels('default') to set defaults.
Users can set values manually using
mgp.axis.labels(x,y) where x and y are 2nd value of
par('mgp') to use. Use mgp.axis.labels(type=w) to
retrieve values, where w='x', 'y', 'x and y', 'xy',
to get 3 mgp values (first 3 types) or 2 mgp.axis.labels.
minor.tick Add minor tick marks to an existing plot
mtitle Add outer titles and subtitles to a multiple plot layout
nomiss Return a matrix after excluding any row with an NA
panel.bpplot Panel function for trellis bwplot - box-percentile plots
panel.plsmo Panel function for trellis xyplot - uses plsmo
pc1 Compute first prin. component and get coefficients on
original scale of variables
plotCorrPrecision Plot precision of estimate of correlation coefficient
plsmo Plot smoothed x vs. y with labeling and exclusion of NAs
Also allows a grouping variable and plots unsmoothed data
popower Power and sample size calculations for ordinal responses
(two treatments, proportional odds model)
prn prn(expression) does print(expression) but titles the
output with 'expression'. Do prn(expression,txt) to add
a heading ('txt') before the 'expression' title
p.sunflowers Sunflower plots (Andreas Ruckstuhl, Werner Stahel,
Martin Maechler, Tim Hesterberg)
ps.slide Set up postscript() using nice defaults for different types
of graphics media
pstamp Stamp a plot with date in lower right corner (pstamp())
Add ,pwd=T and/or ,time=T to add current directory
name or time
Put additional text for label as first argument, e.g.
pstamp('Figure 1') will draw 'Figure 1 date'
putKey Different way to use key()
putKeyEmpty Put key at most empty part of existing plot
rcorr Pearson or Spearman correlation matrix with pairwise deletion
of missing data
rcorr.cens Somers' Dyx rank correlation with censored data
rcorrp.cens Assess difference in concordance for paired predictors
rcspline.eval Evaluate restricted cubic spline design matrix
rcspline.plot Plot spline fit with nonparametric smooth and grouped estimates
rcspline.restate
Restate restricted cubic spline in unrestricted form, and
create TeX expression to print the fitted function
recode Recodes variables
reShape Reshape a matrix into 3 vectors, reshape serial data
rm.boot Bootstrap spline fit to repeated measurements model,
with simultaneous confidence region - least
squares using spline function in time
rMultinom Generate multinomial random variables with varying prob.
samplesize.bin Sample size for 2-sample binomial problem
(Rick Chappell, chappell@stat.wisc.edu)
sas.get Convert SAS dataset to S data frame
sasxport.get Enhanced importing of SAS transport dataset in R
scat1d Add 1-dimensional scatterplot to an axis of an existing plot
(like bar-codes, FEH/Martin Maechler,
maechler@stat.math.ethz.ch/Jens Oehlschlaegel-Akiyoshi,
oehl@psyres-stuttgart.de)
score.binary Construct a score from a series of binary variables or
expressions
sedit A set of character handling functions written entirely
in S. sedit() does much of what the UNIX sed
program does. Other functions included are
substring.location, substring<-, replace.string.wild,
and functions to check if a string is numeric or
contains only the digits 0-9
setpdf Adobe PDF graphics setup for including graphics in books
and reports with nice defaults, minimal wasted space
setps Postscript graphics setup for including graphics in books
and reports with nice defaults, minimal wasted space
Internally uses psfig function by
Antonio Possolo (antonio@atc.boeing.com).
setps works with Ghostscript to convert .ps to .pdf
setTrellis Set Trellis graphics to use blank conditioning panel strips,
line thickness 1 for dot plot reference lines:
setTrellis(); 3 optional arguments
show.col Show colors corresponding to col=0,1,...,99
show.pch Show all plotting characters specified by pch=.
Just type show.pch() to draw the table on the
current device.
showPsfrag Use LaTeX to compile, and dvips and ghostview to
display a postscript graphic containing psfrag strings
solvet Version of solve with argument tol passed to qr
somers2 Somers' rank correlation and c-index for binary y
spearman Spearman rank correlation coefficient spearman(x,y)
spearman.test Spearman 1 d.f. and 2 d.f. rank correlation test
spearman2 Spearman multiple d.f. $\rho^2$, adjusted $\rho^2$, Wilcoxon-Kruskal-
Wallis test, for multiple predictors
spower Simulate power of 2-sample test for survival under
complex conditions
Also contains the Gompertz2,Weibull2,Lognorm2
functions.
spss.get Enhanced importing of SPSS files using read.spss
function
src src(name) = source("name.s") with memory
store Store an object permanently (easy interface to assign function)
strmatch Shortest unique identifier match
(Terry Therneau, therneau@mayo.edu)
subset More easily subset a data frame
substi Substitute one var for another when observations NA
summarize Generate a data frame containing stratified summary
statistics. Useful for passing to trellis.
summary.formula General table making and plotting functions for summarizing
data
symbol.freq X-Y Frequency plot with circles' area prop. to frequency
sys Execute unix() or dos() depending on what's running
tex Enclose a string with the correct syntax for using
with the LaTeX psfrag package, for postscript graphics
transace ace() packaged for easily automatically transforming all
variables in a matrix
transcan Automatic transformation and imputation of NAs for a
series of predictor variables
trap.rule Area under curve defined by arbitrary x and y vectors,
using trapezoidal rule
trellis.strip.blank
To make the strip titles in trellis more visible, you can
make the backgrounds blank by saying trellis.strip.blank().
Use before opening the graphics device.
t.test.cluster 2-sample t-test for cluster-randomized observations
uncbind Form individual variables from a matrix
upData Update a data frame (change names, labels, remove vars, etc.)
units Set or fetch "units" attribute - units of measurement for var.
varclus Graph hierarchical clustering of variables using squared
Pearson or Spearman correlations or Hoeffding D as similarities
Also includes the naclus function for examining similarities in
patterns of missing values across variables.
xy.group Compute mean x vs. function of y by groups of x
xYplot Like trellis xyplot but supports error bars and multiple
response variables that are connected as separate lines
win.slide Setup win.graph or win.printer using nice defaults for
presentations/slides/publications
wtd.mean
wtd.var
wtd.quantile
wtd.ecdf
wtd.table
wtd.rank
wtd.loess.noiter
num.denom.setup Set of functions for obtaining weighted estimates
zoom Zoom in on any graphical display
(Bill Dunlap, bill@statsci.com)
}
System Overrides
Hmisc overrides the system function model.frame.default
to allow for more elegant handling of NAs by allowing
the user to specify a global method for handling NAs
using options(na.action='na.methodname'). Hmisc
overrides the system subscripting method for factor
vectors and date vectors, and it defines functions
is.na.dates and is.na.times to check for NAs in date
and time vectors. The [.factor redefinition by Hmisc
causes by default unused levels to be dropped from the
factor vector's levels attribute when the vector is
subscripted. This can be overridden by using for example
x <- x[,drop=FALSE]
or by specifying a system option as
follows: options(drop.factor.levels=FALSE).
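The two mechanisms described above can be tried out in a minimal sketch that uses base R only. It makes two assumptions: the base method na.omit stands in for 'na.methodname', and the drop argument is always passed explicitly, so the result does not depend on whether the Hmisc override of [.factor is in effect. The data are made up for illustration.

```r
# Base R only; no Hmisc functions are called here.

# 1. A global NA-handling method, picked up by model.frame()
d <- data.frame(y = c(1, 2, NA, 4), x = c(10, NA, 30, 40))
old <- options(na.action = "na.omit")  # stand-in for 'na.methodname'
mf <- model.frame(y ~ x, data = d)     # rows containing an NA are dropped
options(old)                           # restore the previous setting

# 2. Unused factor levels and the drop argument when subscripting
f <- factor(c("a", "b", "c", "a"))
keep <- f != "c"
lev_kept    <- levels(f[keep, drop = FALSE])  # unused level "c" is retained
lev_dropped <- levels(f[keep, drop = TRUE])   # unused level "c" is removed
```

Passing drop explicitly makes the intent visible at the call site, whereas the drop.factor.levels option changes the default for the whole session.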
Hmisc also overrides the trellis shingle function, which
has a bug when its sole argument has a class (such as
the "labelled" class created by the Hmisc label function).
The shingle replacement has the default intervals argument
set to sort(unique(unclass(x))) instead of sort(unique(x)).
Copyright Notice
GENERAL DISCLAIMER
This program is free software; you can redistribute it
and/or modify it under the terms of the GNU General Public
License as published by the Free Software Foundation; either
version 2, or (at your option) any later version.
This program is distributed in the hope that it will be
useful, but WITHOUT ANY WARRANTY; without even the implied
warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR
PURPOSE. See the GNU General Public License for more
details.
In short: You may use it any way you like, as long as you
don't charge money for it, remove this notice, or hold anyone liable
for its results. Also, please acknowledge the source and communicate
changes to the author.
If this software is used in work presented for publication, kindly
reference it using for example:
Harrell FE (2004): Hmisc S function library.
Programs available from http://biostat.mc.vanderbilt.edu/s/Hmisc.
Be sure to reference S-Plus or R itself and other libraries used.
Acknowledgements
This work was supported by grants
from the Agency for Health Care Policy and Research
(US Public Health Service) and the Robert Wood
Johnson Foundation.