factor is used to encode a vector as a factor (the
  terms ‘category’ and ‘enumerated type’ are also used for
  factors).  If argument ordered is TRUE, the factor
  levels are assumed to be ordered.  For compatibility with S there is
  also a function ordered.

  is.factor, is.ordered, as.factor and as.ordered
  are the membership and coercion functions for these classes.

factor(x = character(), levels, labels = levels,
       exclude = NA, ordered = is.ordered(x), nmax = NA)

ordered(x, ...)
is.factor(x)
is.ordered(x)
as.factor(x)
as.ordered(x)
addNA(x, ifany = FALSE)
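
x: a vector of data, usually taking a small number of distinct values.

levels: an optional vector of the values (as character strings) that
    x might have taken.  The default is the unique set of
    values taken by as.character(x), sorted into
    increasing order of x.  Note that this set can be
    specified as smaller than sort(unique(x)).

labels: either an optional character vector of labels for the levels
    (in the same order as levels after removing those in
    exclude), or a character string of length 1.

exclude: a vector of values to be excluded when forming the set of
    levels.  This should be of the same type as x, and
    will be coerced if necessary.

ordered: logical flag to determine if the levels should be regarded as
    ordered (in the order given).

nmax: an upper bound on the number of levels.

...: (in ordered(.)): any of the above, apart from
    ordered itself.

ifany: only add an NA level if it is used, i.e.
    if any(is.na(x)).

factor returns an object of class "factor" which has a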
  set of integer codes the length of x with a "levels"
  attribute of mode character and unique
  (!anyDuplicated(.)) entries.  If argument ordered
  is true (or ordered() is used) the result has class
  c("ordered", "factor").

  Applying factor to an ordered or unordered factor returns a
  factor (of the same type) with just the levels which occur: see also
  [.factor for a more transparent way to achieve this.

  is.factor returns TRUE or FALSE depending on
  whether its argument is of type factor or not.  Correspondingly,
  is.ordered returns TRUE when its argument is an ordered
  factor and FALSE otherwise.

  as.factor coerces its argument to a factor.
  It is an abbreviated form of factor.

  as.ordered(x) returns x if this is ordered, and
  ordered(x) otherwise.

  addNA modifies a factor by turning NA into an extra
  level (so that NA values are counted in tables, for instance).

  The interpretation of a factor depends on both the codes and the
  "levels" attribute.  Be careful only to compare factors with
  the same set of levels (in the same order).  In particular,
  as.numeric applied to a factor is meaningless, and may
  happen by implicit coercion.  To transform a factor f to
  approximately its original numeric values,
  as.numeric(levels(f))[f] is recommended and slightly more
  efficient than as.numeric(as.character(f)).

  The levels of a factor are by default sorted, but the sort order
  may well depend on the locale at the time of creation, and should
  not be assumed to be ASCII.

  There are some anomalies associated with factors that have
  NA as a level.  It is suggested to use them sparingly, e.g.,
  only for tabulation purposes.

  There are "factor" and "ordered" methods for the
  group generic Ops which
  provide methods for the Comparison operators,
  and for the min, max, and
  range generics in Summary
  of "ordered".  (The rest of the groups and the
  Math group generate an error as they
  are not meaningful for factors.)

  Only == and != can be used for factors: a factor can
  only be compared to another factor with an identical set of levels
  (not necessarily in the same ordering) or to a character vector.
  Ordered factors are compared in the same way, but the general dispatch
  mechanism precludes comparing ordered and unordered factors.

  All the comparison operators are available for ordered factors.
  Collation is done by the levels of the operands: if both operands are
  ordered factors they must have the same level set.

  The type of the vector x is not restricted; it only must have
  an as.character method and be sortable (by
  sort.list).

  Ordered factors differ from factors only in their class, but methods
  and the model-fitting functions treat the two classes quite differently.

  The encoding of the vector happens as follows.  First all the values
  in exclude are removed from levels. If x[i]
  equals levels[j], then the i-th element of the result is
  j.  If no match is found for x[i] in levels
  (which will happen for excluded values) then the i-th element
  of the result is set to NA.

  Normally the ‘levels’ used as an attribute of the result are
  the reduced set of levels after removing those in exclude, but
  this can be altered by supplying labels.  This should either
  be a set of new labels for the levels, or a character string, in
  which case the levels are that character string with a sequence
  number appended.

  factor(x, exclude = NULL) applied to a factor is a no-operation
  unless there are unused levels: in that case, a factor with the
  reduced level set is returned.  If exclude is used it should
  also be a factor with the same level set as x or a set of codes
  for the levels to be excluded.

  The codes of a factor may contain NA.  For a numeric
  x, set exclude = NULL to make NA an extra
  level (prints as <NA>); by default, this is the last level.

  If NA is a level, the way to set a code to be missing (as
  opposed to the code of the missing level) is to
  use is.na on the left-hand-side of an assignment (as in
  is.na(f)[i] <- TRUE; indexing inside is.na does not work).
  Under those circumstances missing values are currently printed as
  <NA>, i.e., identical to entries of level NA.

  is.factor is generic: you can write methods to handle
  specific classes of objects, see InternalMethods.

  Where levels is not supplied, unique is called.
  Since factors typically have quite a small number of levels, for large
  vectors x it is helpful to supply nmax as an upper bound
  on the number of unique values.

  See also [.factor for subsetting of factors;
  gl for construction of balanced factors and
  C for factors with specified contrasts;
  levels and nlevels for accessing the
  levels, and unclass to get integer codes.

(ff <- factor(substring("statistics", 1:10, 1:10), levels = letters))
as.integer(ff)      # the internal codes
(f. <- factor(ff))  # drops the levels that do not occur
ff[, drop = TRUE]   # the same, more transparently
factor(letters[1:20], labels = "letter")
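
The encoding rule described earlier (element i of the result is j when x[i] equals levels[j], and NA when there is no match) can be sketched with match(). This is only an illustration of the documented behaviour, not the internal implementation; the input vector x is a made-up example.

```r
# Hypothetical input; any sortable vector with an as.character method would do.
x <- c("b", "a", "c", "a", NA)

# Default levels: the sorted unique values, with NA (the default `exclude`) removed.
lev <- sort(unique(x))      # sort() drops NA by default

# Code for x[i] is the position of x[i] in the levels, or NA if it is not found.
codes <- match(x, lev)

f <- factor(x)
stopifnot(identical(levels(f), lev),
          identical(as.integer(f), codes))
```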
class(ordered(4:1)) # "ordered", inheriting from "factor"
z <- factor(LETTERS[3:1], ordered = TRUE)
## and "relational" methods work:
stopifnot(sort(z)[c(1,3)] == range(z), min(z) < max(z))
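
The comparison rules can be illustrated as follows: == and != work on any factor, while the ordering operators are only meaningful for ordered factors. A minimal sketch with made-up data (the names u and o are illustrative):

```r
# Unordered factor: only == and != are meaningful.
u <- factor(c("lo", "hi", "mid"))
u == "hi"                     # FALSE  TRUE FALSE

# Ordered factor: collation follows the order of the levels, not the alphabet.
o <- factor(c("lo", "hi", "mid"),
            levels = c("lo", "mid", "hi"), ordered = TRUE)
o < "hi"                      # TRUE FALSE  TRUE

# An ordering operator on an unordered factor gives NA (with a warning).
res <- suppressWarnings(u < "hi")
stopifnot(all(is.na(res)))
```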
## suppose you want "NA" as a level, and to allow missing values.
(x <- factor(c(1, 2, NA), exclude = NULL))
is.na(x)[2] <- TRUE
x  # [1] 1    <NA> <NA>
is.na(x)
# [1] FALSE  TRUE FALSE
## Using addNA()
Month <- airquality$Month
table(addNA(Month))
table(addNA(Month, ifany = TRUE))
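
The warning about as.numeric can be made concrete with a small sketch. The vector below is a made-up example; the point is that as.numeric on a factor returns the internal integer codes, whereas indexing the numeric levels by the factor recovers the original values.

```r
## Recovering numeric values from a factor (illustrative data)
f <- factor(c(10, 2.5, 10, 7))   # levels are "2.5", "7", "10"
as.numeric(f)                    # 3 1 3 2: the codes, not the values
as.numeric(levels(f))[f]         # 10.0 2.5 10.0 7.0: recommended
as.numeric(as.character(f))      # same result, slightly less efficient
```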