
R.huge (version 0.10.1)

FileMatrix: Class representing a persistent matrix stored in a file

Description

Package: R.huge
Class FileMatrix

Object
  |
  +--AbstractFileArray
       |
       +--FileMatrix

Directly known subclasses:
FileByteMatrix, FileDoubleMatrix, FileFloatMatrix, FileIntegerMatrix, FileShortMatrix

public static class FileMatrix
extends AbstractFileArray

Usage

FileMatrix(..., nrow=NULL, ncol=NULL, rownames=NULL, colnames=NULL, byrow=FALSE)

Arguments

...

Arguments passed to AbstractFileArray.

nrow, ncol

The number of rows and columns of the matrix.

rownames, colnames

Optional row and column names.

byrow

If TRUE, data are stored row by row, otherwise column by column.

Fields and Methods

Methods:

[                  -
[<-                -
as.character       Returns a short string describing the file matrix.
as.matrix          Returns the elements of a file matrix as an R matrix.
colnames           Gets the column names of a file matrix.
getByRow           Checks if elements are stored row by row or not.
getColumnOffset    -
getMatrixIndicies  -
getOffset          -
getRowOffset       -
ncol               Gets the number of columns of the matrix.
nrow               Gets the number of rows of the matrix.
readFullMatrix     -
readValues         -
rowMeans           Calculates the means for each row.
rowSums            Calculates the sum for each row.
rownames           Gets the row names of a file matrix.
writeValues        -

Methods inherited from AbstractFileArray:
as.character, as.vector, clone, close, delete, dim, dimnames, finalize, flush, getBasename, getBytesPerCell, getCloneNumber, getComments, getDataOffset, getDimensionOrder, getExtension, getFileSize, getName, getPath, getPathname, getSizeOfComments, getSizeOfData, getStorageMode, isOpen, length, open, readAllValues, readContiguousValues, readHeader, readValues, setComments, writeAllValues, writeEmptyData, writeHeader, writeHeaderComments, writeValues

Methods inherited from Object:
$, $<-, [[, [[<-, as.character, attach, attachLocally, clearCache, clearLookupCache, clone, detach, equals, extend, finalize, getEnvironment, getFieldModifier, getFieldModifiers, getFields, getInstantiationTime, getStaticInstance, hasField, hashCode, ll, load, names, objectSize, print, save

Column by column or row by row?

If the matrix elements are to be accessed more often along rows, store data row by row, otherwise column by column.
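To illustrate why the storage order matters, the following base R sketch computes where the elements of a single row end up in each layout. The helper cellIndex() is hypothetical and not part of R.huge; it assumes a plain row-major or column-major layout with no header, which may differ from how the package computes its offsets.

```r
# Hypothetical helper (not part of R.huge): zero-based linear position of
# element (i, j), with one-based i and j, in an nrow-by-ncol matrix.
cellIndex <- function(i, j, nrow, ncol, byrow = FALSE) {
  if (byrow) {
    (i - 1) * ncol + (j - 1)   # row by row: elements of a row are adjacent
  } else {
    (j - 1) * nrow + (i - 1)   # column by column: elements of a column are adjacent
  }
}

# Positions of the five elements of row 3 in a 20-by-5 matrix
rowwise <- cellIndex(3, 1:5, nrow = 20, ncol = 5, byrow = TRUE)
colwise <- cellIndex(3, 1:5, nrow = 20, ncol = 5, byrow = FALSE)
print(rowwise)  # consecutive positions
print(colwise)  # positions that are nrow elements apart
```

With row-by-row storage, a whole row sits in one contiguous block, so reading it takes a single sequential read. With column-by-column storage the same row is spread across the file.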

Supported data types

The following subclasses implement support for various data types:

  • FileByteMatrix (1 byte per element),

  • FileShortMatrix (2 bytes per element),

  • FileIntegerMatrix (4 bytes per element),

  • FileFloatMatrix (4 bytes per element), and

  • FileDoubleMatrix (8 bytes per element).
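The element sizes above make it easy to estimate how much disk space the data of a matrix will need. The sketch below is plain arithmetic in base R; the dimensions are hypothetical, and the estimate covers only the matrix elements, not any file header or comments.

```r
# Bytes per element, as listed above
bytesPerCell <- c(FileByteMatrix = 1, FileShortMatrix = 2,
                  FileIntegerMatrix = 4, FileFloatMatrix = 4,
                  FileDoubleMatrix = 8)

# Hypothetical dimensions chosen for illustration
nrow <- 1e6
ncol <- 100

# Size of the element data only (header and comments not included)
dataBytes <- nrow * ncol * bytesPerCell
print(round(dataBytes / 1024^2, 1))  # in MiB
```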

Author

Henrik Bengtsson

Details

The purpose of this class is to make it possible to work with large matrices in R without being limited by the amount of memory available. Matrices are kept on the file system, and elements are read and written whenever queried. The class is not intended to provide methods for full matrix operations, but to allow working with one subset of the matrix at a time.

For more details, see AbstractFileArray.
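The general idea of reading only a subset from disk can be sketched in base R with seek() and readBin(). This is an illustration of the technique only: the file here is a raw, headerless binary file created for the example, not an R.huge file, and the package's own implementation may work differently.

```r
# A raw binary file with no header (an assumption for this sketch;
# this is not the R.huge file format)
pathname <- tempfile(fileext = ".bin")
nrow <- 20L
ncol <- 5L
bytesPerCell <- 4L

# Write the integers 1..100 as a 20-by-5 matrix stored column by column
con <- file(pathname, open = "wb")
writeBin(1:(nrow * ncol), con, size = bytesPerCell)
close(con)

# Read only column 3 by seeking to its byte offset
j <- 3L
con <- file(pathname, open = "rb")
invisible(seek(con, where = (j - 1L) * nrow * bytesPerCell, origin = "start"))
col3 <- readBin(con, what = "integer", n = nrow, size = bytesPerCell)
close(con)
invisible(file.remove(pathname))

print(col3)  # the 20 values of column 3, i.e. 41 to 60
```

Only 80 bytes are read from the file, regardless of how large the full matrix is.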

Examples

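library("R.huge")   # provides FileByteMatrix and the other file matrix classes
library("R.utils")  # provides Arguments, isFile(), enter() and exit()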
verbose <- Arguments$getVerbose(TRUE)

pathname <- "example.Rmatrix"
if (isFile(pathname)) {
  file.remove(pathname)
  if (isFile(pathname)) {
    stop("File not deleted: ", pathname)
  }
}

# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# Create a new file matrix
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
verbose && enter(verbose, "Creating new matrix")
# The dimensions of the matrix
nrow <- 20
ncol <- 5
X <- FileByteMatrix(pathname, nrow=nrow, ncol=ncol, byrow=TRUE)
verbose && exit(verbose)

verbose && enter(verbose, "Filling it with data")
rows <- c(1:4,7:10)
cols <- c(1)
x <- seq_along(rows)
writeValues(X, rows=rows, cols=cols, values=x)
verbose && exit(verbose)

verbose && enter(verbose, "Getting data again")
y <- readValues(X, rows=rows, cols=cols)
verbose && exit(verbose)
stopifnot(all.equal(x,y))

verbose && enter(verbose, "Setting data using [i,j]")
i <- c(20:18, 13:15)
j <- c(3:2, 4:5)
n <- length(i) * length(j)
values <- 1:n
X[i,j] <- values
verbose && enter(verbose, "Validating")
print(X)
print(X[])
print(X[i,j])
stopifnot(all.equal(as.vector(X[i,j]), values))
verbose && exit(verbose)
verbose && exit(verbose)


# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# Open an already existing file matrix
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
verbose && enter(verbose, "Getting existing matrix")
Y <- FileByteMatrix(pathname)
verbose && exit(verbose)

print(Y[])
Y[5,1] <- 55
print(Y[])
print(X[])  # Note, X and Y refer to the same file, so the change shows in both


# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# Clone a matrix
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Z <- clone(X)
Z[5,1] <- 66
print(Z[])
print(Y[])

# Remove clone again
delete(Z)

# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# Close all matrices
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
close(X)
close(Y)

# Remove original matrix too
delete(X)
