
mChoice is a function that is useful for grouping variables that
represent individual choices on a multiple choice question. These
choices are typically factor or character values but may be of any
type. Levels of component factor variables need not be the same; all
unique levels (or unique character values) are collected over all of
the multiple variables. Then a new character vector is formed with
integer choice numbers separated by semicolons. Optimally, a database
system would have exported the semicolon-separated character strings
with a levels attribute containing strings defining value labels
corresponding to the integer choice numbers. mChoice is a function for
creating a multiple-choice variable after the fact. mChoice variables
are explicitly handled by the describe and summary.formula functions.
NAs or blanks in input variables are ignored.
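The encoding described above can be imitated in a few lines of base R. This is only an illustrative sketch, not the Hmisc implementation: the helper name encode_choices is hypothetical, levels are assumed to be taken in order of first appearance, and codes are assumed to be sorted in ascending order within each observation.

```r
# Hypothetical helper illustrating the encoding; not part of Hmisc.
encode_choices <- function(...) {
  vars <- lapply(list(...), as.character)
  # Pool all unique non-missing, non-blank values across the input variables
  pooled <- unlist(vars)
  lev <- unique(pooled[!is.na(pooled) & pooled != ""])
  # One row per observation, one column per input variable
  m <- do.call(cbind, vars)
  codes <- apply(m, 1, function(row) {
    # match() returns NA for NA or blank entries; sort() drops those NAs
    k <- sort(unique(match(row, lev)))
    paste(k, collapse = ";")
  })
  structure(codes, levels = lev)
}

q1 <- c("Headache", "Hangnail", NA)
q2 <- c("Hangnail", "Headache", "")
encoded <- encode_choices(q1, q2)
encoded
```

In this sketch the first two observations both encode to the same string because duplicates are removed and codes are sorted, and the third observation, which has only an NA and a blank, encodes to an empty string.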
format.mChoice will convert the multiple choice representation to text
form by substituting levels for integer codes.
as.double.mChoice converts the mChoice object to a binary numeric
matrix, one column per used level (or all levels if drop=FALSE). This
is called by the user by invoking as.numeric. There is a print method
and a summary method, and a print method for the summary.mChoice
object. The summary method computes frequencies of all two-way choice
combinations, the frequencies of the top 5 combinations, information
about which other choices are present when each given choice is
present, and the frequency distribution of the number of choices per
observation. This summary output is used in the describe function.
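The binary matrix form can be pictured with a small base R sketch. The helper codes_to_matrix and the sample data below are hypothetical and only illustrate the idea of one 0/1 column per level; the sketch always keeps every level as a column and does not reproduce the drop argument.

```r
# Hypothetical helper: expand "1;3"-style strings into a 0/1 matrix.
# Not the Hmisc implementation; all levels are kept as columns.
codes_to_matrix <- function(codes, levels) {
  picked <- lapply(strsplit(codes, ";", fixed = TRUE), as.integer)
  m <- matrix(0, nrow = length(codes), ncol = length(levels),
              dimnames = list(NULL, levels))
  for (i in seq_along(picked)) {
    # An empty selection gives integer(0), which leaves the row all zeros
    m[i, picked[[i]]] <- 1
  }
  m
}

codes  <- c("1;2", "3", "")
labels <- c("Headache", "Hangnail", "Depressed")
indicator <- codes_to_matrix(codes, labels)
indicator
```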
inmChoice creates a logical vector the same length as x whose elements
are TRUE when the observation in x contains at least one of the codes
or value labels in the second argument.
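The membership test can be sketched in base R as follows. The helper has_choice is hypothetical and is not how Hmisc implements inmChoice; it assumes the semicolon-separated code strings and a separate vector of level labels, and treats character values as labels and numeric values as choice codes.

```r
# Hypothetical helper mimicking the idea of a membership test on
# semicolon-separated choice codes; not part of Hmisc.
has_choice <- function(codes, levels, values) {
  # Character values are treated as labels and mapped to integer codes
  if (is.character(values)) values <- match(values, levels)
  picked <- lapply(strsplit(codes, ";", fixed = TRUE), as.integer)
  # TRUE when an observation contains at least one requested code
  vapply(picked, function(k) any(k %in% values), logical(1))
}

codes  <- c("1;2", "3", "")
labels <- c("Headache", "Hangnail", "Depressed")
by_label <- has_choice(codes, labels, "Hangnail")
by_code  <- has_choice(codes, labels, 3)
```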
match.mChoice creates an integer vector of the indexes of all elements
in table which contain any of the specified levels.
is.mChoice returns TRUE if the argument is a multiple choice variable.
mChoice(..., label='',
        sort.levels=c('original','alphabetic'),
        add.none=FALSE, drop=TRUE)
# S3 method for mChoice
format(x, minlength=NULL, sep=";", ...)
# S3 method for mChoice
as.double(x, drop=FALSE, ...)
# S3 method for mChoice
print(x, quote=FALSE, max.levels=NULL,
width=getOption("width"), ...)
# S3 method for mChoice
as.character(x, ...)
# S3 method for mChoice
summary(object, ncombos=5, minlength=NULL, drop=TRUE, ...)
# S3 method for summary.mChoice
print(x, prlabel=TRUE, ...)
# S3 method for mChoice
[(x, ..., drop=FALSE)
match.mChoice(x, table, nomatch=NA, incomparables=FALSE)
inmChoice(x, values)
is.mChoice(x)
# S3 method for mChoice
Summary(..., na.rm)
na.rm: Logical: remove NAs from data.
table: a vector (mChoice) of values to be matched against.
nomatch: value to return if a value for x does not exist in table.
incomparables: logical; whether incomparable values should be compared.
...: a series of vectors
sort: By default, choice codes are sorted in ascending numeric order.
Set sort=FALSE to preserve the original left to right ordering from
the input variables.
label: a character string label attribute to attach to the matrix
created by mChoice
sort.levels: set sort.levels="alphabetic" to sort the columns of the
matrix created by mChoice alphabetically by category rather than by
the original order of levels in component factor variables (if there
were any input variables that were factors)
add.none: Set add.none to TRUE to make a new category 'none' if it
doesn't already exist and if there is an observation with no choices
selected.
drop: set drop=FALSE to keep unused factor levels as columns of the
matrix produced by mChoice
x: an object of class "mChoice" such as that created by mChoice. For
is.mChoice, x is any object.
object: an object of class "mChoice" such as that created by mChoice
ncombos: maximum number of combos.
width: Width of a line of text to be formatted
quote: quote the output
max.levels: max levels to be displayed
minlength: By default no abbreviation of levels is done in format and
summary. Specify a positive integer to use abbreviation in those
functions. See abbreviate.
sep: character to use to separate levels when formatting
prlabel: set to FALSE to keep print.summary.mChoice from printing the
variable label and number of unique values
values: a scalar or vector. If values is integer, it is the choice
codes, and if it is a character vector, it is assumed to be value
labels.
mChoice returns a character vector of class "mChoice" plus attributes
"levels" and "label". summary.mChoice returns an object of class
"summary.mChoice". inmChoice returns a logical vector. format.mChoice
returns a character vector, and as.double.mChoice returns a binary
numeric matrix.
# NOT RUN {
library(Hmisc)   # provides mChoice, inmChoice, and summary.formula
options(digits=3)
set.seed(3)
n <- 20
sex <- factor(sample(c("m","f"), n, replace=TRUE))
age <- rnorm(n, 50, 5)
treatment <- factor(sample(c("Drug","Placebo"), n, replace=TRUE))
# Generate a 3-choice variable; each of 3 variables has 5 possible levels
symp <- c('Headache','Stomach Ache','Hangnail',
'Muscle Ache','Depressed')
symptom1 <- sample(symp, n, TRUE)
symptom2 <- sample(symp, n, TRUE)
symptom3 <- sample(symp, n, TRUE)
cbind(symptom1, symptom2, symptom3)[1:5,]
Symptoms <- mChoice(symptom1, symptom2, symptom3, label='Primary Symptoms')
Symptoms
print(Symptoms, long=TRUE)
format(Symptoms[1:5])
inmChoice(Symptoms,'Headache')
levels(Symptoms)
inmChoice(Symptoms, 3)
inmChoice(Symptoms, c('Headache','Hangnail'))
# Note: In this example, some subjects have the same symptom checked
# multiple times; in practice these redundant selections would be NAs
# mChoice will ignore these redundant selections
meanage <- N <- numeric(5)
for(j in 1:5) {
  meanage[j] <- mean(age[inmChoice(Symptoms,j)])
  N[j] <- sum(inmChoice(Symptoms,j))
}
names(meanage) <- names(N) <- levels(Symptoms)
meanage
N
# Manually compute mean age for 2 symptoms
mean(age[symptom1=='Headache' | symptom2=='Headache' | symptom3=='Headache'])
mean(age[symptom1=='Hangnail' | symptom2=='Hangnail' | symptom3=='Hangnail'])
summary(Symptoms)
#Frequency table sex*treatment, sex*Symptoms
summary(sex ~ treatment + Symptoms, fun=table)
# Check:
ma <- inmChoice(Symptoms, 'Muscle Ache')
table(sex[ma])
# could also do:
# summary(sex ~ treatment + mChoice(symptom1,symptom2,symptom3), fun=table)
#Compute mean age, separately by 3 variables
summary(age ~ sex + treatment + Symptoms)
summary(age ~ sex + treatment + Symptoms, method="cross")
f <- summary(treatment ~ age + sex + Symptoms, method="reverse", test=TRUE)
f
# each trio of numbers represents the 25th, 50th, and 75th percentiles
print(f, long=TRUE)
# }