##### Formatting Using C-style Formats

formatC() formats numbers individually and flexibly, using C-style format specifications.

prettyNum() is used for “prettifying” (possibly formatted) numbers, also in format.default.

.format.zeros(), an auxiliary function of prettyNum(), re-formats the zeros in a vector x of formatted numbers.

##### Usage

    formatC(x, digits = NULL, width = NULL,
            format = NULL, flag = "", mode = NULL,
            big.mark = "", big.interval = 3L,
            small.mark = "", small.interval = 5L,
            decimal.mark = getOption("OutDec"),
            preserve.width = "individual", zero.print = NULL,
            drop0trailing = FALSE)

    prettyNum(x, big.mark = "", big.interval = 3L,
              small.mark = "", small.interval = 5L,
              decimal.mark = getOption("OutDec"), input.d.mark = decimal.mark,
              preserve.width = c("common", "individual", "none"),
              zero.print = NULL, drop0trailing = FALSE, is.cmplx = NA,
              ...)

    .format.zeros(x, zero.print, nx = suppressWarnings(as.numeric(x)))

##### Arguments
x

an atomic numerical or character object, possibly complex only for prettyNum(), typically a vector of real numbers. Any class is discarded, with a warning.

digits

the desired number of digits after the decimal point (format = "f") or significant digits (format = "g", = "e" or = "fg").

Default: 2 for integer, 4 for real numbers. If less than 0, the C default of 6 digits is used. If specified as more than 50, 50 will be used with a warning unless format = "f" where it is limited to typically 324. (Not more than 15–21 digits need be accurate, depending on the OS and compiler used. This limit is just a precaution against segfaults in the underlying C runtime.)

width

the total field width; if both digits and width are unspecified, width defaults to 1, otherwise to digits + 1. width = 0 will use width = digits, width < 0 means left justify the number in this field (equivalent to flag = "-"). If necessary, the result will have more characters than width. For character data this is interpreted in characters (not bytes nor display width).

format

equal to "d" (for integers), "f", "e", "E", "g", "G", "fg" (for reals), or "s" (for strings). Default is "d" for integers, "g" for reals.

"f" gives numbers in the usual xxx.xxx format; "e" and "E" give n.ddde+nn or n.dddE+nn (scientific format); "g" and "G" put x[i] into scientific format only if it saves space to do so.

"fg" uses fixed format as "f", but digits as the minimum number of significant digits. This can lead to quite long result strings, see examples below. Note that unlike signif this prints large numbers with more significant digits than digits. Trailing zeros are dropped in this format, unless flag contains "#".

flag

for formatC, a character string giving a format modifier as in Kernighan and Ritchie (1988, page 243) or the C99 standard. "0" pads with leading zeros; "-" does left adjustment; others are "+", " ", and "#". On some platform-locale combinations, "'" activates “thousands' grouping” for decimal conversion, and some versions of glibc allow "I" for integer conversion to use the locale's alternative output digits, if any.

There can be more than one of these, in any order. Other characters used to have no effect for character formatting, but signal an error since R 3.4.0.

mode

"double" (or "real"), "integer" or "character". Default: Determined from the storage mode of x.

big.mark

character; if not empty used as mark between every big.interval decimals before (hence big) the decimal point.

big.interval

see big.mark above; defaults to 3.

small.mark

character; if not empty used as mark between every small.interval decimals after (hence small) the decimal point.

small.interval

see small.mark above; defaults to 5.

decimal.mark

the character to be used to indicate the numeric decimal point.

input.d.mark

if x is character, the character known to have been used as the numeric decimal point in x.

preserve.width

string specifying if the string widths should be preserved where possible in those cases where marks (big.mark or small.mark) are added. "common", the default, corresponds to format-like behavior whereas "individual" is the default in formatC(). Value can be abbreviated.

zero.print

logical, character string or NULL specifying if and how zeros should be formatted specially. Useful for pretty printing ‘sparse’ objects.

drop0trailing

logical, indicating if trailing zeros, i.e., "0" after the decimal mark, should be removed; also drops "e+00" in exponential formats.

is.cmplx

optional logical, to be used when x is "character" to indicate if it stems from complex vector or not. By default (NA), x is checked to ‘look like’ complex.

...

arguments passed to format.

nx

numeric vector of the same length as x, typically the numbers of which the character vector x is the pre-format.
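
As a rough illustration of how a few of the arguments above combine, the following sketch formats some made-up values. The variable names and the numbers are chosen here purely for illustration and do not come from the documentation.

```r
# Illustrative values only; the names below are hypothetical.
amount <- 1234567.891

# Fixed notation, two decimals, comma inserted every three digits
with_marks <- formatC(amount, format = "f", digits = 2, big.mark = ",")

# Integer padded with leading zeros to a field of width 6
padded <- formatC(42L, width = 6, flag = "0")

# Trailing zeros after the decimal mark are removed
trimmed <- formatC(1.5, format = "f", digits = 3, drop0trailing = TRUE)

print(c(with_marks, padded, trimmed))
```

Each call returns a character vector, so the results can be pasted into messages or table cells directly.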

##### Details

If you set format it overrides the setting of mode, so formatC(123.45, mode = "double", format = "d") gives 123.
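
A minimal sketch of this precedence, using the value from the sentence above (the variable names are illustrative):

```r
x <- 123.45

# format = "d" takes precedence over mode = "double": the result is integer-like
as_int <- formatC(x, mode = "double", format = "d")

# With a floating-point format the fractional part is kept
as_dbl <- formatC(x, mode = "double", format = "f", digits = 2)

print(c(as_int, as_dbl))
```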

The rendering of scientific format is platform-dependent: some systems use n.ddde+nnn or n.dddenn rather than n.ddde+nn.

formatC does not necessarily align the numbers on the decimal point, so formatC(c(6.11, 13.1), digits = 2, format = "fg") gives c("6.1", " 13"). If you want common formatting for several numbers, use format.
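
The difference can be sketched with the same two numbers in fixed notation. This is only an illustration; supplying an explicit width is one possible way to line up formatC() output, not the only one.

```r
x <- c(6.11, 13.1)

# formatC() treats each element on its own, so the widths differ
individually <- formatC(x, format = "f", digits = 2)

# format() pads all elements to a common width
common <- format(x, nsmall = 2)

# An explicit width makes formatC() pad the shorter string as well
fixed_width <- formatC(x, format = "f", digits = 2, width = 5)

print(nchar(individually))
print(common)
print(fixed_width)
```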

prettyNum is the utility function for prettifying x. x can be complex (or format(<complex>)), here. If x is not a character, format(x[i], ...) is applied to each element, and then it is left unchanged if all the other arguments are at their defaults. Use the input.d.mark argument for prettyNum(x) when x is a character vector not resulting from something like format(<number>) with a period as decimal mark.
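
A small sketch of both input kinds follows. It assumes the default getOption("OutDec") of "." and uses made-up values; the variable names are illustrative.

```r
# Numeric input is first passed through format(), then marks are inserted
from_number <- prettyNum(1234567, big.mark = ",")

# Character input is assumed to use "." as its decimal mark here
from_string <- prettyNum("1234567.891", big.mark = ",")

# With every other argument at its default, the input comes back unchanged
unchanged <- prettyNum("1234567.891")

print(c(from_number, from_string, unchanged))
```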

Because gsub is used to insert the big.mark and small.mark, special characters need escaping. In particular, to insert a single backslash, use "\\\\".

The C doubles used for R numerical vectors have signed zeros, which formatC may output as -0, -0.000 ….
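
The following sketch constructs a negative zero and passes it to formatC(). Whether the sign appears in the output depends on the platform's C runtime, so no particular output is claimed here.

```r
# Unary minus applied to 0 yields a negative zero
neg_zero <- -0

# Dividing by it distinguishes it from a positive zero
is_negative_zero <- identical(1 / neg_zero, -Inf)
print(is_negative_zero)

# The sign may be carried through to the formatted strings
print(formatC(neg_zero))
print(formatC(neg_zero, format = "f", digits = 3))
```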

There is a warning if big.mark and decimal.mark are the same: that would be confusing to those reading the output.
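
One way to observe this warning without letting it propagate is sketched below; the handler and the variable names are illustrative, not part of formatC().

```r
warned <- FALSE

# Using "," for both marks should trigger the warning described above
res <- withCallingHandlers(
  formatC(1234.5, format = "f", digits = 1,
          big.mark = ",", decimal.mark = ","),
  warning = function(w) {
    warned <<- TRUE                 # record that a warning was signalled
    invokeRestart("muffleWarning")  # keep going and return the result
  }
)

print(warned)
print(res)
```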

##### Value

A character object of same size and attributes as x (after discarding any class), in the current locale's encoding.

Unlike format, each number is formatted individually. Looping over each element of x, the C function sprintf(…) is called for numeric inputs (inside the C function str_signif).
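
For a simple case the correspondence with a C format string can be sketched as follows. The comparison is an illustration for this one argument combination, not a statement about every option.

```r
# width = 8, three digits after the decimal point, fixed notation
via_formatC <- formatC(pi, format = "f", digits = 3, width = 8)

# The same request written as a C-style format string
via_sprintf <- sprintf("%8.3f", pi)

print(c(via_formatC, via_sprintf))
```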

formatC: for character x, do simple (left or right) padding with white space.
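
The padding of character input can be sketched with the strings used in the Examples section (the result names are illustrative):

```r
words <- c("a", "Abc", "no way")

# Positive width pads on the left, so the strings are right-aligned
right_aligned <- formatC(words, width = 7)

# Negative width pads on the right, like flag = "-"
left_aligned <- formatC(words, width = -7)
left_flag <- formatC(words, width = 7, flag = "-")

print(right_aligned)
print(left_aligned)
```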

##### Note

The default for decimal.mark in formatC() was changed in R 3.2.0. For use within print methods in packages which might be used with earlier versions, pass decimal.mark = getOption("OutDec") explicitly.
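
A short sketch of setting the decimal mark explicitly, with a made-up value and illustrative names:

```r
# A fixed decimal mark, independent of the session's options
comma_decimal <- formatC(1234.5, format = "f", digits = 1, decimal.mark = ",")

# Following the session's "OutDec" option explicitly
session_decimal <- formatC(1234.5, format = "f", digits = 1,
                           decimal.mark = getOption("OutDec"))

print(c(comma_decimal, session_decimal))
```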

##### References

Kernighan, B. W. and Ritchie, D. M. (1988) The C Programming Language. Second edition. Prentice Hall.

##### See Also

format.

sprintf for more general C-like formatting.

##### Examples

    xx <- pi * 10^(-5:4)
    cbind(format(xx, digits = 4), formatC(xx))
    cbind(formatC(xx, width = 9, flag = "-"))
    cbind(formatC(xx, digits = 5, width = 8, format = "f", flag = "0"))
    cbind(format(xx, digits = 4), formatC(xx, digits = 4, format = "fg"))

    formatC(c("a", "Abc", "no way"), width = -7)  # <=> flag = "-"
    formatC(c((-1:1)/0, c(1, 100)*pi), width = 8, digits = 1)

    ## note that some of the results here depend on the implementation
    ## of long-double arithmetic, which is platform-specific.
    xx <- c(1e-12, -3.98765e-10, 1.45645e-69, 1e-70, pi*1e37, 3.44e4)
    ##        1         2             3         4       5        6
    formatC(xx)
    formatC(xx, format = "fg")       # special "fixed" format.
    formatC(xx[1:4], format = "f", digits = 75) #>> even longer strings

    formatC(c(3.24, 2.3e-6), format = "f", digits = 11, drop0trailing = TRUE)

    r <- c("76491283764.97430", "29.12345678901", "-7.1234", "-100.1", "1123")
    ## American:
    prettyNum(r, big.mark = ",")
    ## Some Europeans:
    prettyNum(r, big.mark = "'", decimal.mark = ",")

    (dd <- sapply(1:10, function(i) paste((9:0)[1:i], collapse = "")))
    prettyNum(dd, big.mark = "'")

    ## examples of 'small.mark'
    pN <- stats::pnorm(1:7, lower.tail = FALSE)
    cbind(format (pN, small.mark = " ", digits = 15))
    cbind(formatC(pN, small.mark = " ", digits = 17, format = "f"))

    cbind(ff <- format(1.2345 + 10^(0:5), width = 11, big.mark = "'"))
    ## all with same width (one more than the specified minimum)

    ## individual formatting to common width:
    fc <- formatC(1.234 + 10^(0:8), format = "fg", width = 11, big.mark = "'")
    cbind(fc)
    ## Powers of two, stored exactly, formatted individually:
    pow.2 <- formatC(2^-(1:32), digits = 24, width = 1, format = "fg")
    ## nicely printed (the last line showing 5^32 exactly):
    noquote(cbind(pow.2))

    ## complex numbers:
    r <- 10.0000001; rv <- (r/10)^(1:10)
    (zv <- (rv + 1i*rv))
    op <- options(digits = 7) ## (system default)
    (pnv <- prettyNum(zv))
    stopifnot(pnv == "1+1i", pnv == format(zv),
              pnv == prettyNum(zv, drop0trailing = TRUE))
    ## more digits change the picture:
    options(digits = 8)
    head(fv <- format(zv), 3)
    prettyNum(fv)
    prettyNum(fv, drop0trailing = TRUE) # a bit nicer
    options(op)

    ## The ' flag :
    doLC <- FALSE # R warns, so change to TRUE manually if you want see the effect
    if(doLC)
      oldLC <- Sys.setlocale("LC_NUMERIC", "de_CH.UTF-8")
    formatC(1.234 + 10^(0:4), format = "fg", width = 11, flag = "'")
    ## --> ..... " 1'001" " 10'001" on supported platforms
    if(doLC) ## revert, typically to "C" :
      Sys.setlocale("LC_NUMERIC", oldLC)

### Community examples

richie@datacamp.com at Jan 22, 2017 base v3.3.2

# formatC()

formatC() is based upon the C/C++ function printf(). The [documentation](http://www.cplusplus.com/reference/cstdio/printf) for that function contains details on how many of the arguments to formatC() work.

## Basic usage

formatC() takes a numeric vector as an input and returns a character vector. With no arguments other than x, formatC() works like [signif()](https://www.rdocumentation.org/packages/base/topics/signif) + [as.character()](https://www.rdocumentation.org/packages/base/topics/character) (though with different rules for rounding).

    (x <- 1.2345 * 10 ^ (-4:4))
    formatC(x)
    as.character(signif(x, 4))

## digits and format arguments

These arguments affect each other, and need to be discussed together. format determines whether the numbers should be displayed using fixed, scientific, or integer-like formatting. digits determines either the number of _significant digits_ or the number of _decimal places_ to round the numbers to, depending upon the value of format. Note that the documentation is apparently wrong (as of v3.3.2): the default value of digits is always 4, even for integer inputs.

format = "d" treats the numbers as integers, so nothing after the decimal place is shown. Conversion to integer using [as.integer()](https://www.rdocumentation.org/packages/base/topics/integer) is shown for comparison.

    (x <- c(1.2345, 6.789) * 10 ^ (-2:3) * c(1, -1, 1))
    formatC(x, format = "d")
    as.character(as.integer(x))

format = "e" uses scientific formatting. In this case, digits refers to the number of digits shown after the decimal point of the mantissa, so each number is displayed with digits + 1 significant digits. See [signif()](https://www.rdocumentation.org/packages/base/topics/signif) for comparison.

    (x <- 1.2345 * 10 ^ (-4:4))
    formatC(x, format = "e", digits = 1)
    formatC(x, format = "e", digits = 2)
    formatC(x, format = "e", digits = 3)
    formatC(x, format = "e") # implicitly digits = 4
    formatC(x, format = "e", digits = 5)

format = "E" is a variant of format = "e" that prints an upper-case "E" to denote the start of the exponent.

    (x <- 1.2345 * 10 ^ (-4:4))
    formatC(x, format = "E")

format = "f" uses fixed formatting. In this case, digits refers to the number of decimal places to round each number to. See [round()](https://www.rdocumentation.org/packages/base/topics/round) for comparison.

    (x <- 1.2345 * 10 ^ (-4:4))
    formatC(x, format = "f", digits = 1)
    formatC(x, format = "f", digits = 2)
    formatC(x, format = "f", digits = 3)
    formatC(x, format = "f") # implicitly digits = 4
    formatC(x, format = "f", digits = 5)

format = "g" automatically chooses between fixed and scientific formatting, picking whichever takes the least amount of space. digits refers to significant digits.

    (x <- 1.2345 * 10 ^ (-4:4))
    formatC(x, format = "g", digits = 1)
    formatC(x, format = "g", digits = 2)
    formatC(x, format = "g", digits = 3)
    formatC(x, format = "g") # implicitly digits = 4
    formatC(x, format = "g", digits = 5)

format = "G" is a variant of format = "g" that prints an upper-case "E" to denote the start of the exponent in scientifically formatted values.

    (x <- 1.2345 * 10 ^ (-4:4))
    formatC(x, format = "G")

format = "fg" always uses fixed formatting, like format = "f", except that digits refers to the minimum number of significant digits rather than decimal places. In this case, trailing zeroes are dropped.

    (x <- 1.2345 * 10 ^ (-4:4))
    formatC(x, format = "fg", digits = 1)
    formatC(x, format = "fg", digits = 2)
    formatC(x, format = "fg", digits = 3)
    formatC(x, format = "fg") # implicitly digits = 4
    formatC(x, format = "fg", digits = 5)

Negative values of digits are treated as digits = 6.

    (x <- 1.2345 * 10 ^ (-4:4))
    formatC(x, format = "e", digits = -1)
    formatC(x, format = "f", digits = -1)

The largest allowed value of digits is 324 when format = "f", and 50 otherwise. Note that under most circumstances, you won't get better than 15 significant figures of accuracy, so the rest will be gibberish.

    (x <- 1.2345 * 10 ^ (-4:4))
    formatC(x, format = "f", digits = 325)
    formatC(x, format = "e", digits = 51)

format = "s" is a little different. It is designed to be used with an input that already contains strings (rather than numbers). Its main purpose is to be used with the width argument (see below) to pad strings to a specified number of characters.

## width argument

width specifies the minimum number of characters of each element of the result. If the number is too short, it is padded on the left with spaces. This is mostly useful for getting columns of numbers to line up.

    (x <- 1.2345 * 10 ^ (-4:4))
    formatC(x, format = "f", width = 6)
    formatC(x, format = "f", width = 7)
    formatC(x, format = "f", width = 8)
    formatC(x, format = "f", width = 9)

You can specify a negative width to pad on the right instead.

    (x <- 1.2345 * 10 ^ (-4:4))
    formatC(x, format = "f", width = -6)
    formatC(x, format = "f", width = -7)
    formatC(x, format = "f", width = -8)
    formatC(x, format = "f", width = -9)

width = 0 is interpreted as width = digits, in which case you should never have padding.

    (x <- 1.2345 * 10 ^ (-4:4))
    formatC(x, format = "f", digits = 8, width = 0)
    formatC(x, format = "f", digits = 8, width = 8)

## flag argument

There are five modifying flags available: +, " " (a space), -, 0, and #. flag takes a string containing zero or more of these flags in any order.

flag = "+" means that positive numbers are prefixed with a plus sign.

    (x <- c(1, -1, 1) * 1.2345 * 10 ^ (-4:4))
    formatC(x, flag = "+")

flag = " " means that positive numbers are prefixed with a space. When flag = "+ ", the plus sign takes priority.

    (x <- c(1, -1, 1) * 1.2345 * 10 ^ (-4:4))
    formatC(x, flag = " ")
    # space flag ignored
    formatC(x, flag = "+ ")
    formatC(x, flag = " +")

flag = "-" means pad short strings on the right, exactly like using a negative width. Using flag = "-" and a negative width _doesn't_ double reverse the padding; it still pads on the right.

    (x <- 1.2345 * 10 ^ (-4:4))
    formatC(x, width = 10, flag = "-")
    formatC(x, width = -10, flag = "-")

flag = "0" means pad short strings with zeroes, not spaces. It cannot be used in combination with flag = "-" to pad with zeroes on the right.

    (x <- 1.2345 * 10 ^ (-4:4))
    formatC(x, width = 10, flag = "0")
    formatC(x, width = 10, flag = "-0") # zero flag ignored

flag = "#" prints a trailing decimal point in the case where digits = 0.

    (x <- 1.2345 * 10 ^ (-4:4))
    formatC(x, digits = 0, format = "e")
    formatC(x, digits = 0, format = "e", flag = "#")
    formatC(x, digits = 0, format = "f")
    formatC(x, digits = 0, format = "f", flag = "#")
    formatC(x, digits = 0, format = "g")
    formatC(x, digits = 0, format = "g", flag = "#")

## big.mark and big.interval arguments

When the format is a fixed type (format %in% c("d", "f", "fg")), big.mark and big.interval control separators of digits before the decimal point. For example, in the USA, monetary quantities are often separated by a comma after every three digits. In France, monetary quantities are separated by a space every three digits. [Sys.localeconv()](https://www.rdocumentation.org/packages/base/topics/Sys.localeconv) and the C++ documentation for [lconv structures](http://www.cplusplus.com/reference/clocale/lconv) describe the correct formatting for numbers in a given locale. In particular, note the thousands_sep, grouping, mon_thousands_sep, and mon_grouping elements.

    (x <- 1.2345 * 10 ^ (1:10))
    formatC(x, format = "d", big.mark = ",", big.interval = 3)
    formatC(x, format = "d", big.mark = " ", big.interval = 3)

## small.mark, small.interval, and decimal.mark arguments

These work in a similar way to big.mark and big.interval, but they control the decimal mark and the digits after it. Use of these arguments only makes sense when format %in% c("f", "fg").

    (x <- 1.2345 * 10 ^ (0:-10))
    formatC(x, format = "fg", decimal.mark = ".", small.mark = ",", small.interval = 3)
    formatC(x, format = "fg", decimal.mark = ",", small.mark = " ", small.interval = 3)