Learn R Programming

xlsx (version 0.1.1)

read.xlsx: Read the contents of a worksheet into an R data.frame

Description

Read the contents of a worksheet into an R data.frame.

Usage

read.xlsx(file, sheetIndex, sheetName=NULL, rowIndex=NULL,
  as.data.frame=TRUE, header=TRUE, colClasses=NA, keepFormulas=FALSE)

Arguments

file
the absolute path to the file which the data are to be read from.
sheetIndex
a number representing the sheet index in the workbook.
sheetName
a character string with the sheet name.
rowIndex
a numeric vector indicating the rows you want to extract. If NULL, all the rows found will be extracted.
as.data.frame
a logical value indicating if the result should be coerced into a data.frame. If FALSE, the result is a list with one element for each non empty cell. The name for each element of this list is formed by pasting toge
header
a logical value indicating whether the first row corresponding to the first element of the rowIndex vector contains the names of the variables.
colClasses
a vector of classes to be assumed for the columns. Recycled as necessary, or if the character vector is named, unspecified values are taken to be NA.
keepFormulas
a logical value indicating if Excel formulas should be shown as text in Rand not evaluated before bringing them in.

Value

  • A data.frame containing the data as downloaded from the worksheet. The rownames of the dataframe are the unique list id's used internally by Google. You need these unique id's if you want to overwrite the rows of the spreadsheet.

Details

This function provides a high level API for reading data from an Excel 2007 worksheet. It calls several low level functions in the process. Its goal is to provide the conveniency of read.csv by borrowing from its signature.

The function pulls the value of each non empty cell in the worksheet into a vector of type list by preserving the data type. If as.data.frame=TRUE, this vector of lists is then formatted into a rectangular shape. Special care is needed for worksheets with ragged data.

The class type of the variable corresponding to one column in the worksheet is taken from the class of the first non empty cell in that column. If you need to impose a specific class type on a variable, use the colClasses argument.

Excel internally stores dates and datetimes as numeric values, and does not keep track of time zones and DST. When a datetime column is brought into R, it is converted to POSIXct class with a GMT timezone. Occasional rounding errors may appear and the Rand Excel string representation my differ by one second.

See Also

write.xlsx for writing xlsx documents. See also saveWorkbook for reading a workbook into R for further processing and saveWorkbook for saving a workbook to file.

Examples

Run this code
file <- system.file("tests", "test_import.xlsx", package = "xlsx")
res <- read.xlsx(file, 1)  # read first sheet
head(res)
#          NA. Population Income Illiteracy Life.Exp Murder HS.Grad Frost   Area
# 1    Alabama       3615   3624        2.1    69.05   15.1    41.3    20  50708
# 2     Alaska        365   6315        1.5    69.31   11.3    66.7   152 566432
# 3    Arizona       2212   4530        1.8    70.55    7.8    58.1    15 113417
# 4   Arkansas       2110   3378        1.9    70.66   10.1    39.9    65  51945
# 5 California      21198   5114        1.1    71.71   10.3    62.6    20 156361
# 6   Colorado       2541   4884        0.7    72.06    6.8    63.9   166 103766
# >

Run the code above in your browser using DataLab