Hmisc (version 3.0-12)

Overview: Overview of Hmisc Library

Description

The Hmisc library contains many functions useful for data analysis, high-level graphics, utility operations, functions for computing sample size and power, translating SAS datasets into S, imputing missing values, advanced table making, variable clustering, character string manipulation, conversion of S objects to LaTeX code, recoding variables, and bootstrap repeated measures analysis. Most of these functions were written by F Harrell, but a few were collected from statlib and from s-news; other authors are indicated below. This collection of functions includes all of Harrell's submissions to statlib other than the functions in the Design and display libraries. A few of the functions do not have "Help" documentation.

To make Hmisc load silently, issue options(Hverbose=FALSE) before library(Hmisc).
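
A minimal sketch of that load sequence, assuming Hmisc is installed in the current library path. Hverbose is the option named above; releases other than 3.0-12 may no longer consult it, in which case setting it is harmless.

```r
# Set the option first: it is read when the package is attached,
# so it has no effect if set after library(Hmisc).
options(Hverbose = FALSE)
library(Hmisc)
```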

Functions

Function Name        Purpose

abs.error.pred       Computes various indexes of predictive accuracy based on absolute errors, for linear models
all.is.numeric       Check if character strings are legal numerics
approxExtrap         Linear extrapolation
aregImpute           Multiple imputation based on additive regression, bootstrapping, and predictive mean matching
areg.boot            Nonparametrically estimate transformations for both sides of a multiple additive regression, and bootstrap these estimates and $R^2$
ballocation          Optimum sample allocations in 2-sample proportion test
binconf              Exact confidence limits for a proportion and more accurate (narrower!) score stat.-based Wilson interval (Rollin Brant, mod. FEH)
bootkm               Bootstrap Kaplan-Meier survival or quantile estimates
bpower               Approximate power of 2-sided test for 2 proportions. Includes bpower.sim for exact power by simulation
bpplot               Box-Percentile plot (Jeffrey Banfield, umsfjban@bill.oscs.montana.edu)
bsamsize             Sample size requirements for test of 2 proportions
bystats              Statistics on a single variable by levels of >=1 factors
bystats2             2-way statistics
calltree             Calling tree of functions (David Lubinsky, david@hoqax.att.com)
character.table      Shows numeric equivalents of all latin characters. Useful for putting many special chars. in graph titles (Pierre Joyet, pierre.joyet@bluewin.ch)
ciapower             Power of Cox interaction test
cleanup.import       More compactly store variables in a data frame, and clean up problem data when e.g. Excel spreadsheet had a non-numeric value in a numeric column
combine.levels       Combine infrequent levels of a categorical variable
comment              Attach a comment attribute to an object: comment(fit) <- 'Used old data'; comment(fit) prints the comment
confbar              Draws confidence bars on an existing plot using multiple confidence levels distinguished using color or gray scale
contents             Print the contents (variables, labels, etc.) of a data frame
cpower               Power of Cox 2-sample test allowing for noncompliance
Cs                   Vector of character strings from list of unquoted names
csv.get              Enhanced importing of comma separated files labels
cut2                 Like cut with better endpoint label construction and allows construction of quantile groups or groups with given n
datadensity          Snapshot graph of distributions of all variables in a data frame. For continuous variables uses scat1d.
dataRep              Quantify representation of new observations in a database
ddmmmyy              SAS "date7" output format for a chron object
deff                 Kish design effect and intra-cluster correlation
describe             Function to describe different classes of objects. Invoke by saying describe(object). It calls one of the following:
describe.data.frame  Describe all variables in a data frame (generalization of SAS UNIVARIATE)
describe.default     Describe a variable (generalization of SAS UNIVARIATE)
do                   Assists with batch analyses
dot.chart            Dot chart for one or two classification variables
Dotplot              Enhancement of Trellis dotplot allowing for matrix x-var., auto generation of Key function, superposition
drawPlot             Simple mouse-driven drawing program, including a function for fitting Bezier curves
ecdf                 Empirical cumulative distribution function plot
eip                  Edit an object "in-place" (may be dangerous!), e.g. eip(sqrt) will replace the builtin sqrt function
errbar               Plot with error bars (Charles Geyer, U. Chi., mod FEH)
event.chart          Plot general event charts (Jack Lee, jjlee@mdanderson.org, Ken Hess, Joel Dubin; Am Statistician 54:63-70,2000)
event.history        Event history chart with time-dependent cov. status (Joel Dubin, joel.dubin@yale.edu)
find.matches         Find matches (with tolerances) between columns of 2 matrices
first.word           Find the first word in an S expression (R Heiberger)
fit.mult.impute      Fit most regression models over multiple transcan imputations, compute imputation-adjusted variances and avg. betas
format.df            Format a matrix or data frame with much user control (R Heiberger and FE Harrell)
ftupwr               Power of 2-sample binomial test using Fleiss, Tytun, Ury
ftuss                Sample size for 2-sample binomial test using Fleiss, Tytun, Ury (ftupwr and ftuss both by Dan Heitjan, dheitjan@biostats.hmc.psu.edu)
gbayes               Bayesian posterior and predictive distributions when both the prior and the likelihood are Gaussian
getHdata             Fetch and list datasets on our web site
gs.slide             Sets nice defaults for graph sheets for S-Plus 2000 for copying graphs into Microsoft applications
hdquantile           Harrell-Davis nonparametric quantile estimator with s.e.
histbackback         Back-to-back histograms (Pat Burns, Salomon Smith Barney, London, pburns@dorado.sbi.com)
hist.data.frame      Matrix of histograms for all numeric vars. in data frame. Use hist.data.frame(data.frame.name)
histSpike            Add high-resolution spike histograms or density estimates to an existing plot
hoeffd               Hoeffding's D test (omnibus test of independence of X and Y)
impute               Impute missing data (generic method)
interaction          More flexible version of builtin function
is.present           Tests for non-blank character values or non-NA numeric values
james.stein          James-Stein shrinkage estimates of cell means from raw data
labcurve             Optimally label a set of curves that have been drawn on an existing plot, on the basis of gaps between curves. Also position legends automatically at emptiest rectangle.
label                Set or fetch a label for an S-object
Lag                  Lag a vector, padding on the left with NA or ''
latex                Convert an S object to LaTeX (R Heiberger & FE Harrell)
ldBands              Lan-DeMets bands for group sequential tests
list.tree            Pretty-print the structure of any data object (Alan Zaslavsky, zaslavsk@hcp.med.harvard.edu)
Load                 Enhancement of load
mask                 8-bit logical representation of a short integer value (Rick Becker)
matchCases           Match each case on one continuous variable
matxv                Fast matrix * vector, handling intercept(s) and NAs
mem                  mem() types quick summary of memory used during session
mgp.axis             Version of axis() that uses appropriate mgp from mgp.axis.labels and gets around bug in axis(2, ...) that causes it to assume las=1
mgp.axis.labels      Used by survplot and plot in Design library (and other functions in the future) so that different spacing between tick marks and axis tick mark labels may be specified for x- and y-axes. ps.slide, win.slide, gs.slide set up nice defaults for mgp.axis.labels. Otherwise use mgp.axis.labels('default') to set defaults. Users can set values manually using mgp.axis.labels(x,y) where x and y are 2nd value of par('mgp') to use. Use mgp.axis.labels(type=w) to retrieve values, where w='x', 'y', 'x and y', 'xy', to get 3 mgp values (first 3 types) or 2 mgp.axis.labels.
minor.tick           Add minor tick marks to an existing plot
mtitle               Add outer titles and subtitles to a multiple plot layout
nomiss               Return a matrix after excluding any row with an NA
panel.bpplot         Panel function for trellis bwplot - box-percentile plots
panel.plsmo          Panel function for trellis xyplot - uses plsmo
pc1                  Compute first prin. component and get coefficients on original scale of variables
plotCorrPrecision    Plot precision of estimate of correlation coefficient
plsmo                Plot smoothed x vs. y with labeling and exclusion of NAs. Also allows a grouping variable and plots unsmoothed data
popower              Power and sample size calculations for ordinal responses (two treatments, proportional odds model)
prn                  prn(expression) does print(expression) but titles the output with 'expression'. Do prn(expression,txt) to add a heading ('txt') before the 'expression' title
p.sunflowers         Sunflower plots (Andreas Ruckstuhl, Werner Stahel, Martin Maechler, Tim Hesterberg)
ps.slide             Set up postscript() using nice defaults for different types of graphics media
pstamp               Stamp a plot with date in lower right corner (pstamp()). Add ,pwd=T and/or ,time=T to add current directory name or time. Put additional text for label as first argument, e.g. pstamp('Figure 1') will draw 'Figure 1 date'
putKey               Different way to use key()
putKeyEmpty          Put key at most empty part of existing plot
rcorr                Pearson or Spearman correlation matrix with pairwise deletion of missing data
rcorr.cens           Somers' Dyx rank correlation with censored data
rcorrp.cens          Assess difference in concordance for paired predictors
rcspline.eval        Evaluate restricted cubic spline design matrix
rcspline.plot        Plot spline fit with nonparametric smooth and grouped estimates
rcspline.restate     Restate restricted cubic spline in unrestricted form, and create TeX expression to print the fitted function
recode               Recodes variables
reShape              Reshape a matrix into 3 vectors, reshape serial data
rm.boot              Bootstrap spline fit to repeated measurements model, with simultaneous confidence region - least squares using spline function in time
rMultinom            Generate multinomial random variables with varying prob.
samplesize.bin       Sample size for 2-sample binomial problem (Rick Chappell, chappell@stat.wisc.edu)
sas.get              Convert SAS dataset to S data frame
sasxport.get         Enhanced importing of SAS transport dataset in R
Save                 Enhancement of save
scat1d               Add 1-dimensional scatterplot to an axis of an existing plot (like bar-codes, FEH/Martin Maechler, maechler@stat.math.ethz.ch/Jens Oehlschlaegel-Akiyoshi, oehl@psyres-stuttgart.de)
score.binary         Construct a score from a series of binary variables or expressions
sedit                A set of character handling functions written entirely in S. sedit() does much of what the UNIX sed program does. Other functions included are substring.location, substring<-, replace.string.wild, and functions to check if a string is numeric or contains only the digits 0-9
setpdf               Adobe PDF graphics setup for including graphics in books and reports with nice defaults, minimal wasted space
setps                Postscript graphics setup for including graphics in books and reports with nice defaults, minimal wasted space. Internally uses psfig function by Antonio Possolo (antonio@atc.boeing.com). setps works with Ghostscript to convert .ps to .pdf
setTrellis           Set Trellis graphics to use blank conditioning panel strips, line thickness 1 for dot plot reference lines: setTrellis(); 3 optional arguments
show.col             Show colors corresponding to col=0,1,...,99
show.pch             Show all plotting characters specified by pch=. Just type show.pch() to draw the table on the current device.
showPsfrag           Use LaTeX to compile, and dvips and ghostview to display a postscript graphic containing psfrag strings
solvet               Version of solve with argument tol passed to qr
somers2              Somers' rank correlation and c-index for binary y
spearman             Spearman rank correlation coefficient: spearman(x,y)
spearman.test        Spearman 1 d.f. and 2 d.f. rank correlation test
spearman2            Spearman multiple d.f. $\rho^2$, adjusted $\rho^2$, Wilcoxon-Kruskal-Wallis test, for multiple predictors
spower               Simulate power of 2-sample test for survival under complex conditions. Also contains the Gompertz2, Weibull2, Lognorm2 functions.
spss.get             Enhanced importing of SPSS files using read.spss function
src                  src(name) = source("name.s") with memory
store                Store an object permanently (easy interface to assign function)
strmatch             Shortest unique identifier match (Terry Therneau, therneau@mayo.edu)
subset               More easily subset a data frame
substi               Substitute one var for another when observations NA
summarize            Generate a data frame containing stratified summary statistics. Useful for passing to trellis.
summary.formula      General table making and plotting functions for summarizing data
symbol.freq          X-Y Frequency plot with circles' area prop. to frequency
sys                  Execute unix() or dos() depending on what's running
tex                  Enclose a string with the correct syntax for using with the LaTeX psfrag package, for postscript graphics
transace             ace() packaged for easily automatically transforming all variables in a matrix
transcan             Automatic transformation and imputation of NAs for a series of predictor variables
trap.rule            Area under curve defined by arbitrary x and y vectors, using trapezoidal rule
trellis.strip.blank  To make the strip titles in trellis more visible, you can make the backgrounds blank by saying trellis.strip.blank(). Use before opening the graphics device.
t.test.cluster       2-sample t-test for cluster-randomized observations
uncbind              Form individual variables from a matrix
upData               Update a data frame (change names, labels, remove vars, etc.)
units                Set or fetch "units" attribute - units of measurement for var.
varclus              Graph hierarchical clustering of variables using squared Pearson or Spearman correlations or Hoeffding D as similarities. Also includes the naclus function for examining similarities in patterns of missing values across variables.
xy.group             Compute mean x vs. function of y by groups of x
xYplot               Like trellis xyplot but supports error bars and multiple response variables that are connected as separate lines
win.slide            Setup win.graph or win.printer using nice defaults for presentations/slides/publications
wtd.mean, wtd.var, wtd.quantile, wtd.ecdf, wtd.table, wtd.rank, wtd.loess.noiter, num.denom.setup
                     Set of functions for obtaining weighted estimates
zoom                 Zoom in on any graphical display (Bill Dunlap, bill@statsci.com)
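
A few of the functions listed above can be tried together on a small data frame. The sketch below is illustrative only: the data values and the object names d, vars, and age.group are made up, and the calls follow the argument names used by current CRAN releases of Hmisc, which may differ in detail from version 3.0-12.

```r
library(Hmisc)

# Made-up data, for illustration only
d <- data.frame(age = c(34, 51, 47, 62, 29, 58, 41, 66),
                sbp = c(118, 135, 127, 150, 110, 142, 125, 160))

vars <- Cs(age, sbp)                    # character vector from unquoted names

label(d$age) <- "Age at enrollment"     # attach a label to a variable
units(d$sbp) <- "mmHg"                  # attach units of measurement
comment(d)   <- "Illustrative data"     # attach a comment attribute

age.group <- cut2(d$age, g = 2)         # split into two quantile groups
print(table(age.group))

print(describe(d))                      # describe all variables in the data frame
print(rcorr(d$age, d$sbp, type = "spearman"))      # Spearman correlation matrix
print(binconf(x = 12, n = 40, method = "wilson"))  # Wilson interval for 12/40
```

Labels and units set this way are picked up by describe and by several of the plotting functions, which is the main reason to attach them before summarizing.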

Copyright Notice

GENERAL DISCLAIMER

This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

In short: You may use it any way you like, as long as you don't charge money for it, remove this notice, or hold anyone liable for its results. Also, please acknowledge the source and communicate changes to the author.

If this software is used in work presented for publication, kindly reference it using, for example: Harrell FE (2004): Hmisc S function library. Programs available from http://biostat.mc.vanderbilt.edu/s/Hmisc. Be sure to reference S-Plus or R itself and other libraries used.

Acknowledgements

This work was supported by grants from the Agency for Health Care Policy and Research (US Public Health Service) and the Robert Wood Johnson Foundation.

References

See Alzola CF, Harrell FE (2004): An Introduction to S and the Hmisc and Design Libraries at http://biostat.mc.vanderbilt.edu/twiki/pub/Main/RS/sintro.pdf for extensive documentation and examples for the Hmisc package.