ff (version 1.0-1)

ff: Flat file database designed for large data vectors

Description

The function ff and its methods handle data stored in a flat file with memory-mapped pages. ff is the constructor function for ff objects, which are numerical vectors stored in a flat file. The maximum size of the flat file is 16 GB on 32-bit platforms, subject to any limits imposed by the file system.

Usage

ff(file, length = 0, pagesize = getdefaultpagesize(), readonly = FALSE)
## S3 method for class 'ff':
x[index]
## S3 method for class 'ff':
x[index] <- value
## S3 method for class 'ff':
dim(x)
## S3 method for class 'ff':
length(x)
## S3 method for class 'ff':
sample(x, size, replace = FALSE, prob = NULL)
## S3 method for class 'ff':
print(x, ...)

Arguments

file
character string giving the name of a file to load or create.
length
length of the double vector, used when the object is created or re-created.
pagesize
page size (in multiples of the system page size, see getpagesize).
readonly
logical indicating whether the flat file should be accessed as read-only.
x
an ff object.
index
indices specifying elements to extract or replace.
value
suitable replacement value or vector of values.
size
non-negative integer giving the number of items to choose.
replace
should sampling be with replacement?
prob
a vector of probability weights for obtaining the elements of the vector being sampled. The argument prob is ignored in the sample method for ff.
...
further arguments passed to or from other methods.

Details

On 32-bit R platforms, indexing is limited to a maximum of $2^{31}-1$ elements. Using a multi-dimensional array allows the data vector to exceed this limit (see ffm).

Because ff objects are held by external pointers, copying one copies the reference, not the data. The lifetime of the connection between an ff object and its implementation (written in C++) is controlled by the garbage collector gc. To close an ff object explicitly, delete the reference first and then call the garbage collector.

ff depends on the facilities of the operating system and the file system. For example, it is not possible to create files larger than 4GB on FAT32 systems. The following table gives an overview of file size limits for common file systems (see http://en.wikipedia.org/wiki/Comparison_of_file_systems for further details):

File System   File size limit
FAT16         2GB
FAT32         4GB
NTFS          16GB
ext2/3/4      16GB to 2TB
ReiserFS      4GB (up to version 3.4) / 8TB (from version 3.5)
XFS           8EB
JFS           4PB
HFS           2GB
HFS Plus      16GB
UFS1          4GB to 256TB
UFS2          512GB to 32PB
UDF           16EB
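The index limit and the file size limits above can be related with a little arithmetic. The following is a rough sketch in base R, not part of ff: the helper ff_file_bytes is hypothetical, and it assumes that each element is an 8-byte double, that the flat file carries no extra header, and that the units in the table are binary (1GB = 2^30 bytes). Only three of the listed file systems are included.

```r
# Hypothetical helper: bytes needed to store n doubles.
# Assumes 8 bytes per element and no file header.
ff_file_bytes <- function(n) n * 8

GB <- 2^30  # assumption: the table uses binary units
EB <- 2^60

# A few limits taken from the table above
fs_limit_bytes <- c(FAT16 = 2 * GB, FAT32 = 4 * GB, XFS = 8 * EB)

n_max    <- .Machine$integer.max   # 2^31 - 1, the 32-bit index limit
size_max <- ff_file_bytes(n_max)   # bytes for a vector of maximum length

size_max / GB                      # just under 16, in line with the 16 GB figure

# Would a vector of maximum length fit on each file system?
fits <- size_max <= fs_limit_bytes
print(fits)
```

Under these assumptions, a vector of maximum length needs just under 16 GB, so it would not fit on FAT16 or FAT32 but would fit on XFS.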

Examples

a <- ff("foo.ff", 8192)        # create a big vector
a[1:10] <- rnorm(10)           # set data cells
a[1:10]                        # get data cells
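
The reference semantics and the explicit close described in Details might be exercised along the following lines. This is a sketch only: it assumes the ff package is installed at version 1.0-1 with the interface documented on this page, and it skips the ff calls otherwise. The variable have_old_ff and the temporary file path are illustrative choices, not part of ff.

```r
# Assumption: ff version 1.0-1, the interface documented on this page.
# packageVersion() parses "1.0-1" as 1.0.1.
have_old_ff <- requireNamespace("ff", quietly = TRUE) &&
  packageVersion("ff") == "1.0.1"

if (have_old_ff) {
  library(ff)
  path <- tempfile(fileext = ".ff")  # illustrative file location
  a <- ff(path, 8192)                # create the flat file
  b <- a                             # copies the reference, not the data
  b[1] <- 42                         # write through the second reference
  print(a[1])                        # per Details, a and b share one flat file
  rm(a, b)                           # delete the references first ...
  invisible(gc())                    # ... then let the garbage collector close it
}
```

Calling gc() only after rm() follows the order given in Details: while a reference still exists, the garbage collector has nothing to release.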