RTextTools (version 1.3.8)

wizard_read_data: a simplified function for reading data from files.

Description

A simple interface for reading in data from files and creating a corpus all in one step.

Usage

wizard_read_data(filename, tablename = NULL, filetype = "csv",
    virgin = FALSE, textColumns, codeColumn, trainSize, testSize, ...)

Arguments

filename
Character string giving the name of the file. Include the path if the file is not located in the working directory.
tablename
Microsoft Access databases only: the name of the table to read from the database. Defaults to NULL.
filetype
Character string specifying the file type. Options are "csv" for .csv files, "tab" for text files, and "accdb" or "mdb" for Microsoft Access databases. Defaults to "csv".
virgin
A logical (TRUE or FALSE) specifying whether to treat the classification data as virgin data, that is, data with no known labels. Defaults to FALSE, meaning the classification data is not virgin data.
textColumns
A cbind() of the column(s) to use for training the algorithms (e.g. cbind(data$Title)).
codeColumn
A factor or vector of labels corresponding to each document in the matrix.
trainSize
A range (e.g. 1:1000) specifying the number of documents to use for training the models.
testSize
A range (e.g. 1001:2000) specifying the number of documents to use for classification.
...
Other parameters to be passed on to create_matrix.
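
The shapes described for textColumns, codeColumn, trainSize, and testSize can be sketched in base R without calling wizard_read_data itself. This is only an illustration under stated assumptions: the data frame and the names text_columns, code_column, train_size, and test_size are hypothetical, and the sanity checks are not part of RTextTools.

```r
# Hypothetical toy data standing in for a file that has already been read.
data <- data.frame(
  Title = c("Budget vote delayed", "Team wins final",
            "New vaccine approved", "Markets rally"),
  Subject = c("politics", "sports", "health", "business"),
  Topic.Code = c(20, 29, 3, 15),
  stringsAsFactors = FALSE
)

# textColumns: a cbind() of one or more text columns, one row per document.
text_columns <- cbind(data$Title, data$Subject)

# codeColumn: a vector (or factor) with one label per document.
code_column <- data$Topic.Code

# trainSize / testSize: non-overlapping ranges over the documents.
train_size <- 1:3
test_size <- 4:4

# Sanity checks (an assumption of this sketch, not package behavior).
stopifnot(nrow(text_columns) == length(code_column))
stopifnot(length(intersect(train_size, test_size)) == 0)
stopifnot(max(train_size, test_size) <= nrow(text_columns))
```

Checking that the two ranges do not overlap and stay within the number of documents before building a corpus helps catch off-by-one mistakes early.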

Value

Examples

library(RTextTools)
# Read the NYTimes sample data shipped with RTextTools and create the corpus in one step.
corpus <- wizard_read_data(system.file("data/NYTimes.csv.gz", package="RTextTools"),
    textColumns=c("Title","Subject"), codeColumn="Topic.Code", trainSize=75,
    testSize=25, virgin=FALSE)
