rio (version 0.2)

import: Read data.frame or matrix from a file

Description

This function imports a data frame or matrix from a data file with the file format based on the file extension (or the manually specified format, if format is specified). import supports the following file formats:
  • Tab-separated data (.tsv), using read.table with row.names = FALSE and stringsAsFactors = FALSE (or, if fread = TRUE, fread)
  • Comma-separated data (.csv), using read.csv with row.names = FALSE and stringsAsFactors = FALSE (or, if fread = TRUE, fread)
  • Pipe-separated data (.psv), using read.table with sep = '|', row.names = FALSE, and stringsAsFactors = FALSE (or, if fread = TRUE, fread)
  • Fixed-width format data (.fwf), using a faster version of read.fwf that requires a widths argument and by default in rio has stringsAsFactors = FALSE
  • Serialized R objects (.rds), using readRDS
  • Saved R objects (.RData), using load for single-object .RData files
  • JSON (.json), using fromJSON
  • Stata (.dta), using read_dta. If haven = FALSE, read.dta can be used.
  • SPSS and SPSS portable (.sav and .por), using read_sav and read_por. If haven = FALSE, read.spss can be used for .sav files.
  • "XBASE" database files (.dbf), using read.dbf
  • Weka Attribute-Relation File Format (.arff), using read.arff
  • R syntax object (.R), using dget
  • Excel (.xls and .xlsx), using read_excel. If readxl = FALSE, read.xlsx can be used.
  • SAS (.sas7bdat) and SAS XPORT (.xpt), using read_sas and read.xport
  • Minitab (.mtp), using read.mtp
  • Epiinfo (.rec), using read.epiinfo
  • Systat (.syd), using read.systat
  • Data Interchange Format (.dif), using read.DIF
  • OpenDocument Spreadsheet (.ods), using read.ods
  • Shallow XML documents (.xml), using xmlToDataFrame. Note: optional arguments not recognized by xmlToDataFrame are passed to xmlParse.
  • Clipboard import (on Windows and Mac OS), using read.table with row.names = FALSE
  • Fortran data (no recognized extension), using read.fortran
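
The extension-based dispatch described above can be illustrated with a minimal base R sketch. This is not rio's implementation: import_sketch is a hypothetical helper that covers only a few of the listed formats, and it assumes delimited files carry a header row.

```r
# Hypothetical helper, NOT rio's code: pick a base R reader from the file
# extension, or from an explicit `format` override when one is given.
import_sketch <- function(file, format = NULL, ...) {
  fmt <- if (is.null(format)) tolower(tools::file_ext(file)) else format
  # Map the "," shortcut onto the csv format; other codes pass through
  fmt <- switch(fmt, "," = "csv", fmt)
  switch(fmt,
    csv = read.csv(file, stringsAsFactors = FALSE, ...),
    tsv = read.table(file, sep = "\t", header = TRUE,
                     stringsAsFactors = FALSE, ...),
    psv = read.table(file, sep = "|", header = TRUE,
                     stringsAsFactors = FALSE, ...),
    rds = readRDS(file),
    stop("Unrecognized format: ", fmt)
  )
}

# Usage: comma-separated content stored under a .tsv name, so the
# extension alone would select the wrong reader without the override.
tmp <- tempfile(fileext = ".tsv")
write.csv(head(iris), tmp, row.names = FALSE)
x <- import_sketch(tmp, format = "csv")
unlink(tmp)
```

Resolving the format first and dispatching second keeps the override logic in one place, which is roughly the behavior the format argument is documented to provide.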

Usage

import(file, format, setclass, ...)

Arguments

file
A character string naming a file, URL, or single-file .zip or .tar archive.
format
An optional character string code of file format, which can be used to override the format inferred from file. Shortcuts include: "," (for comma-separated values), ";" (for semicolon-separated values), and
setclass
An optional character vector specifying one or more classes to set on the imported object. By default, the returned object is always a data.frame. Reasonable values for this might be tbl_df (if using dplyr) or data.table.
...
Additional arguments passed to the underlying import functions. For example, this can control column classes for delimited file types, or control the use of haven for Stata and SPSS or readxl for Excel (.xlsx) format. See details below.

Value

  • An R data.frame. If setclass is used, this data.frame may have additional class attribute values.
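
What "additional class attribute values" means can be shown with base R alone. The sketch below does not call rio and makes no claim about how import sets classes internally; "my_tbl" is a made-up class name used only for illustration.

```r
# Start from an ordinary data.frame
df <- head(iris)

# Prepend a hypothetical extra class; the object keeps "data.frame" as well
class(df) <- c("my_tbl", class(df))

class(df)                   # both class names are now present
inherits(df, "data.frame")  # data.frame methods still apply
```

Because the data.frame class is retained, code written for plain data frames should continue to work on such an object, while packages that recognize the extra class can dispatch their own methods on it.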

Examples

library("rio")

# create CSV to import
export(iris, "iris1.csv")

# specify `format` to override default format
export(iris, "iris.tsv", format = "csv")
stopifnot(identical(import("iris1.csv"), import("iris.tsv", format = "csv")))

# import CSV as a `data.table`
stopifnot(inherits(import("iris1.csv", setclass = "data.table"), "data.table"))

# pass arguments to underlying import function
iris1 <- import("iris1.csv")
identical(names(iris), names(iris1))

export(iris, "iris2.csv", col.names = FALSE)
iris2 <- import("iris2.csv")
identical(names(iris), names(iris2))

# set class for the response data.frame as "tbl_df" (from dplyr)
stopifnot(inherits(import("iris1.csv", setclass = "tbl_df"), "tbl_df"))

# cleanup
unlink("iris.tsv")
unlink("iris1.csv")
unlink("iris2.csv")