RODBCimplements odbc database connectivity with compliant databases where drivers exist on the host system. Two groups of commands are provided.
odbc*commands implement relatively low level access to the odbc functions of similar name.
sql*commands are higher level constructs to read, save, copy and manipulate data between data frames and sql tables. In general
sql*commands return a data frame on success, or -1/verbose on error depending on the
odbc*group return -1 in
staton error. Up to 16 connections can be open at once to any combination of dsn/hosts. Columns are limited to 255 chars of non-binary data. The functions where usage is obvious from the name are not described below.
sqlQuery(channel, query, errors = TRUE, as.is = FALSE, as = "data frame", max = 0, buffsize = 1000, nullstring = "NA", na.strings = "NA") sqlSave(channel, dat, tablename = NULL, append = FALSE, rownames = FALSE, colnames = FALSE, verbose = FALSE, test = FALSE, fast = TRUE, nastring = NULL) sqlFetch(channel, sqtable, errors = TRUE, as.is = FALSE, as = "data frame", colnames = FALSE, rownames = FALSE, nullstring = "NA", na.strings = "NA") sqlTables(channel) sqlPrimaryKeys(channel, sqtable) sqlColumns(channel, sqtable, errors = FALSE, special = FALSE) sqlDrop(channel, sqtable, errors = TRUE, verbose = TRUE) sqlClear(channel, sqtable, errors = TRUE, verbose = TRUE) sqlCopy(channel, query, destination, destchannel = -1, verbose = TRUE, errors = TRUE) sqlCopyTable(channel, srctable, desttable, destchannel = -1, verbose = TRUE, errors = TRUE) sqlGetResults(channel, as.is = FALSE, as = "data frame", errors = FALSE, max = 0, buffsize = 1000, nullstring = "NA", na.strings = "NA") sqlUpdate(channel, dat, verbose = FALSE, test = FALSE, nastring = NULL)
odbcConnect(dsn, uid="", pwd="", host="localhost", case="nochange") odbcClose(channel)
odbcClearError(channel) odbcQuery(channel, query) odbcTables(channel) odbcGetErrMsg(channel) odbcColumns(channel, table) odbcPrimaryKeys(channel, table) odbcFetchRow(channel) odbcFetchRows(channel, max = 0, transposing = FALSE, buffsize = 1000, nullstring = "NA") odbcColData(channel) odbcNumRows(channel) odbcNumFields(channel) odbcNumCols(channel)
"transposed matrix", the latter two being character matrices as returned by the DBMS.
NAs to the database. For
fast = TRUEit is not possible to write
SQL_NULL_DATA(which is done for
fast = FALSE), so
SQL_NULL_DATAitems from the database.
NAwhen reading data.
odbcConnectestablishes a connection to the dsn at
host. It returns a integer, which is used as handle if no error occurred, -1 otherwise. For databases that translate table and column names to case must be set as appropriate. Allowable values are
toloweras well as the names of databases where the behaviour is known to me (currently
sqlQuery is the workhorse function. It sends the SQL statement
query to the server, using connection
channel, returned by
odbcConnect. Returns a data frame of results, transformed
errors=FALSE returns -1 on error, otherwise
halts with a message from the server.
transposing reverses columns
and rows if
buffsize will yield a marginal increase in
speed if increased for some database engines eg MSaccess. SQL beginners
should note that the term `Query' includes any valid SQL statement
including table creation, alteration, updates etc as well as SELECTs.
sqlQuery command is a convenience wrapper that calls first
odbcQuery and then
sqlGetResults. If finer grained control,
for example over the number of rows fetched, these functions should be
sqlGetResults is a mid-level function. It should be called
after a call to
odbcQuery and used to retrieve waiting results into a dataframe.
Its main use is with
max set to non zero it will retrieve the
result set in batches with repeated calls. This is useful for very large
result sets which can be subjected to intermediate processing.
sqlSave saves the
dat in the table
dat. The table name is
taken from tablename if given or the name of the dataframe.
If the table exists and
has the appropriate structure it is used, or else it is created anew
rownames=TRUE the first column of
the table will be the row labels with colname
can also be a string giving the desired name (see example).
copied the column names into row 1. This is intended for cases where
case conversion alters the original column names and it is desired that
they are retained. Note that there are drawbacks to this approach:
it presupposes that the rows will be returned in correct order;
not always valid.
It will also cause numeric rows to be returned as factors.
sqlSave uses the 'great white shark'
method of testing tables (bite it and see). The logic will unceremoniously
DROP the table and create it anew with VARCHAR column types in its attempt
to find a writeable solution.
test=TRUE will not necessarily predict
this behaviour. Attempting to write indexed columns or writing to pseudo-
columns are less obvious causes of failed writes followed by a DROP.
If your table structure is precious to you back it up.
sqlFetch loads the
the entire contents of the table
dat. (The reverse of
sqlSave) Rownames and column names are restored as indicated.
(More accurately the first row and column returned is transferred
to the row/col names).
sqlCopy as above,
but saves the output of
query in table
copies the structure of
desttable on dsn
destchannel. This is within the limitation of the odbc lowest
common denominator. More precise control is possible via
the content of the table
sqtable. No confirmation is requested.
sqtable. No confirmation is requested.
sqlUpdate updates the table
where the rows already exist. The dataframe must contain a column
named after the row that the database regards as teh optimal for
defining a row uniquely. (This is returned by
information as data frames. Note that the column names contain
underscores and are invalid in S unless quoted. The column names are
not constant across ODBC versions so the data should be accessed by
column number. The argument
returns the rows needed to specify a row uniquely. This is intended
to form the basis of a WHERE clause for updates.
odbcClose(channel) Clean up and free resources.
odbcFetchRows This function returns a matrix of the pending
$data limited to
max rows if
max is greater than 0.
buffsize may be increased from the
default of 1000 (rows*cols) for increased performance in a large dataset.
This only has an effect with servers that do not return the number of rows
affected by a query e.g. MSAccess, MSSqlServer.
TRUE the matrix will be transposed.
This function is called by sqlGetResults, which then converts the
matrix to a dataframe. This step incurs a significant performance
penalty and working with matrices is much faster in large rowsets.
odbcFetchRow(channel) is a deprecated function that returns
a vector comprising the next row of the waiting rowset.
The remaining functions beginning
odbc are lower level
functions that normally
require explicit looping to deal with the results. Most return -1 on
failure, indicating that a message is waiting for
The exception is that an invalid channel returns -2. Examples are
present in the
sqlGetResults returns the
last result set created by an
odbc* call as a data frame.
library(RODBC) data(USArrests) channel <- odbcConnect("test", "", "") # userId and password sqlSave(channel, USArrests, rownames = "State", verbose = TRUE) options("dec",".") # this is the default decimal point sqlQuery(channel, "select State, Murder from USArrests where rape > 30 order by Murder") sqlFetch(channel, "USArrests") # get the lot sqlDrop(channel, USArrests) odbcClose(channel) rm(USArrests)
Run the code above in your browser using DataCamp Workspace