readstata13

Package to read all Stata file formats (version 15 and older) into an R data.frame and to write a data.frame back to a dta file. The dta file format versions 102 to 118 are supported.

The function read.dta from the foreign package imports only dta files from Stata versions <= 12. Due to the different structure and features of dta 117 files, we wrote a new file reader in Rcpp.

Additionally, the package supports many features of the Stata dta format, such as label sets in different languages (?set.lang) or business calendars (?as.caldays).
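As a rough sketch of the label-language feature: the snippet assumes that the installed package ships an example file extdata/statacar.dta and that this file contains a label set named "de". Both are assumptions, not something this README states, so the code checks for the file before using it.

```r
library(readstata13)

# Assumption: the installed package ships this example file.
path <- system.file("extdata/statacar.dta", package = "readstata13")

if (nzchar(path)) {
  dat <- read.dta13(path)

  # Show the label languages stored in the file.
  get.lang(dat)

  # Assumption: a label set named "de" exists in this file.
  dat_de <- set.lang(dat, "de")
  get.lang(dat_de)
}
```

If the file is not present, system.file() returns an empty string and the block is skipped.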

Installation

The package is now hosted on CRAN.

install.packages("readstata13")

Usage

library(readstata13)
dat <- read.dta13("path to file.dta")
save.dta13(dat, file="newfile.dta")
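The usage above needs an existing dta file. A minimal, self-contained round trip could look like the following sketch, which uses the built-in mtcars data set and a temporary file (both chosen here only for illustration).

```r
library(readstata13)

# Write a built-in data set to a temporary dta file.
tmp <- tempfile(fileext = ".dta")
save.dta13(mtcars, file = tmp)

# Read it back into a data.frame.
back <- read.dta13(tmp)

# Compare the values only; attributes and row names are expected to differ.
all.equal(back, mtcars, check.attributes = FALSE)
```

Row names are not written by default, so the comparison ignores attributes.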

Development Version

To install the current release from GitHub you need the platform-specific build tools. On Windows a current installation of Rtools is necessary, while OS X users need to install Xcode.

# install.packages("devtools")
devtools::install_github("sjewo/readstata13", ref="0.9.2")

Older versions of devtools require a username option:

install_github("readstata13", username="sjewo", ref="0.9.2")

To install the current development version from GitHub:

devtools::install_github("sjewo/readstata13", ref="testing")

Current Status

Changelog and Features

Version  Changes
0.9.2    Fix Build on MacOS X
0.9.1    Allow reading only pre-selected variables
0.9.1    Experimental support for format 119
0.9.1    Improvements to partial reading. Idea by Kevin Jin
0.9.1    Export of binary data from dta-files
0.9.1    new function get.label.tables() to show all Stata label sets
0.9.1    Fix check for duplicate labels and in set.lang()
0.9.0    Generate unique factor labels to prevent errors in factor definition
0.9.0    check interrupt for long read. Patch by Giovanni Righi
0.9.0    Updates to notes, roxygen and register
0.9.0    Fixed size of character length. Bug reported by Yiming (Paul) Li
0.9.0    Fix saving characters containing missings. Bug reported by Eivind H. Olsen
0.9.0    Adjustments to convert.underscore. Patch by luke-m-olson
0.9.0    Allow partial reading of selected rows
0.8.5    Fix errors on big-endian systems
0.8.4    Fix valgrind errors. converting from dta.write to writestr
0.8.4    Fix for empty data label
0.8.4    Make replace.strl default
0.8.3    Restrict length of varnames to 32 chars for compatibility with Stata 14
0.8.3    Add many function tests
0.8.3    Avoid converting of double to floats while writing compressed files
0.8.2    Save NA values in character vector as empty string
0.8.2    Convert.underscore=T will convert all non-literal characters to underscores
0.8.2    Fix saving of Dates
0.8.2    Save with convert.factors by default
0.8.2    Test for NaN and inf values while writing missing values and replace with NA
0.8.2    Remove message about saving factors
0.8.1    Convert non-integer variables to factors (nonint.factors=T)
0.8.1    Handle large datasets
0.8.1    Working with strL variables is now a lot faster
<0.8.1   Reading data files from disk or url and create a data.frame
<0.8.1   Saving dta files to disk - most features of the dta file format are supported
<0.8.1   Assign variable names
<0.8.1   Read the new strL strings and save them as attribute
<0.8.1   Convert stata label to factors and save them as attribute
<0.8.1   Read some meta data (timestamp, dataset label, formats,...)
<0.8.1   Convert strings to system encoding
<0.8.1   Handle different NA values
<0.8.1   Handle multiple label languages
<0.8.1   Convert dates
<0.8.1   Reading business calendar files
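The partial-reading entries for 0.9.0 and 0.9.1 can be sketched as follows. The argument names select.rows and select.cols are taken from the package documentation as I understand it, and the mtcars data set and temporary file are only illustrative choices.

```r
library(readstata13)

# Create a small dta file to read from.
tmp <- tempfile(fileext = ".dta")
save.dta13(mtcars, file = tmp)

# Read only rows 1 to 10 (a two-element vector gives a row range).
first10 <- read.dta13(tmp, select.rows = c(1, 10))

# Read only two pre-selected variables, given by name.
two_cols <- read.dta13(tmp, select.cols = c("mpg", "cyl"))

dim(first10)
names(two_cols)
```

Reading a subset this way avoids loading a large file completely when only a few rows or variables are needed.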

Test

Because the attributes we set differ from those set by foreign::read.dta, all.equal and identical report FALSE when applied to the whole data.frame. If you compare the values column by column, everything is identical.

library("foreign")
library("readstata13")

# Read the same data set with both readers
r12 <- read.dta("http://www.stata-press.com/data/r12/auto.dta")
r13 <- read.dta13("http://www.stata-press.com/data/r13/auto.dta")

# Compare the values column by column
Map(identical, r12, r13)

# Compare the attributes one by one
att <- names(attributes(r12))
for (i in seq_along(att))
	cat(att[i], ":", all.equal(attr(r12, att[i]), attr(r13, att[i])), "\n")

# Repeat the value comparison without converting labels to factors
r12 <- read.dta("http://www.stata-press.com/data/r12/auto.dta", convert.factors = FALSE)
r13 <- read.dta13("http://www.stata-press.com/data/r13/auto.dta", convert.factors = FALSE)

Map(identical, r12, r13)
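A variant of this check that needs no network access might look like the sketch below. It assumes that a file written by foreign::write.dta (an older dta format) can be read by both readers. The mtcars data set and the temporary file are illustrative only.

```r
library(foreign)
library(readstata13)

# Write an older-format dta file locally with foreign.
tmp <- tempfile(fileext = ".dta")
write.dta(mtcars, file = tmp)

# Read the same file with both readers.
r_foreign <- read.dta(tmp)
r_13 <- read.dta13(tmp)

# Compare values only, ignoring the differing attributes.
all.equal(r_foreign, r_13, check.attributes = FALSE)
```

Passing check.attributes = FALSE restricts the comparison to the values, which is the part the two readers are expected to agree on.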

Authors

Marvin Garbuszus (JanMarvin) and Sebastian Jeworutzki (sjewo)

Licence

GPL2

Version

0.9.2

License

GPL-2 | file LICENSE

Last Published

May 26th, 2018
