### An example illustrating why care is needed ###
library(lsr) # provides quantileCut(); assumes the lsr package is installed
dataset <- c( 0,1,2, 3,4,5, 7,10,15 ) # note the uneven spread of data
x <- quantileCut( dataset, 3 ) # cut into 3 equally frequent bins
table(x) # tabulate
#
# (-0.015,2.67]   (2.67,5.67]     (5.67,15]
#             3             3             3
#
# Notice the uneven bin widths: category 1 covers a range from 0 to 2.67 and
# category 2 covers a similarly sized range from 2.67 to 5.67, but the third
# category covers a much larger range, from 5.67 to 15. These categories might
# be useful in some contexts (e.g., when the data are on an ordinal scale), but
# it is important to check that this is so.
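#
# A minimal base-R sketch of where those boundaries come from. It assumes
# (this is an assumption, not something stated above) that the bins are
# delimited by the sample tertiles as returned by quantile() with its default
# settings. The names vals, q_breaks and q_widths are illustrative only.
vals <- c( 0,1,2, 3,4,5, 7,10,15 ) # same values as dataset
q_breaks <- quantile( vals, probs = seq(0, 1, length.out = 4) ) # tertile boundaries
q_widths <- diff( unname(q_breaks) ) # width of each bin
round( q_widths, 2 )
#
# [1] 2.67 3.00 9.33
#
# The third bin is more than three times as wide as either of the other two.
#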
# For comparison purposes, here is the behaviour of the more standard cut
# function when applied to the same data:
y <- cut( dataset, 3 )
table(y)
#
# (-0.015,5]     (5,10]    (10,15]
#          6          2          1
#
# This time the categories cover an equal range but have highly unequal
# frequencies.
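#
# The same check for the equal-width case, again only as a hedged sketch. It
# assumes the bins split the observed range into three equal parts and passes
# those break points to cut() explicitly, so the end points are exactly 0 and
# 15 rather than slightly widened. The names vals, w_breaks and w_counts are
# illustrative only.
vals <- c( 0,1,2, 3,4,5, 7,10,15 ) # same values as dataset
w_breaks <- seq( min(vals), max(vals), length.out = 4 ) # 0, 5, 10, 15
diff( w_breaks ) # every bin has the same width
#
# [1] 5 5 5
#
w_counts <- as.vector( table( cut( vals, w_breaks, include.lowest = TRUE ) ) )
w_counts # observations per bin
#
# [1] 6 2 1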