RMVL (version 1.1.0.1)

Mappable Vector Library for Handling Large Datasets

Description

The Mappable Vector Library (MVL) provides a convenient way to access large datasets. Use all of your data at once, with few limits. Memory-mapped data can be shared between multiple R processes. Access speed depends on the storage medium, so a solid-state drive is recommended, preferably with a PCI Express (or M.2 NVMe) interface, or a fast network file system. The data is memory mapped into R and then accessed using the usual R list and array subscription operators. Convenience functions are provided for merging, grouping, and indexing large vectors and data.frames. The layout of the underlying MVL files is optimized for large datasets. Vectors are stored so that they remain aligned for vector intrinsics after memory mapping. The package is built on top of libMVL, which can be used as a standalone C library. libMVL has a simple C API, making it easy to exchange datasets with outside programs. Large MVL datasets are distributed via Academic Torrents.
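The workflow described above (write data to an MVL file, memory map it, then read it back with ordinary subscripts) might look roughly like the following sketch. It assumes RMVL is installed and uses only functions named in the index on this page; the file path, the entry name "test_frame", and the sample data are made up for illustration, and the argument names for mvl_open are taken from the package examples as best recalled, so check them against the function reference.

```r
library(RMVL)

# Hypothetical scratch file; any writable location would do.
path <- file.path(tempdir(), "example.mvl")

# Create the file and write a small data.frame under an illustrative name.
M <- mvl_open(path, append = TRUE, create = TRUE)
mvl_write_object(M, data.frame(x = 1:5, y = letters[1:5]), "test_frame")
mvl_close(M)

# Reopen the file read-only; its contents are memory mapped, not loaded.
M2 <- mvl_open(path)

# List the directory entries stored in the file.
entry_names <- names(M2)

# Fetch the entry by name and force a full conversion to a plain R object,
# so the copy remains valid after the file is closed.
df <- mvl2R(M2["test_frame"])

mvl_close(M2)
```

Converting with mvl2R before closing the file is a deliberate choice here: objects that still refer to the memory map should not be used once the handle has been closed.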

Install

install.packages('RMVL')

Monthly Downloads

373

Version

1.1.0.1

License

LGPL-2.1

Maintainer

Vladimir Dergachev

Last Published

September 14th, 2024

Functions in RMVL (1.1.0.1)

mvl_object_stats

Return MVL object properties

mvl_neighbors_lapply

Apply function to indices of nearby rows

mvl_compute_repeats

Find stretches of repeated rows among vectors

mvl_open

Open an MVL file

mvl_order_vectors

Return permutation sorting vector entries

mvl_write_serialized_object

Write R object in serialized form

mvl_write_object

Write R object into MVL file

mvl_hash_vectors

Return hash values for each row

mvl_status

Return status of MVL package

mvl_write_extent_index

Compute and write extent index

mvl_group_lapply

Apply function to index stretches

mvl_remap

Enlarge memory map to include recently loaded data

mvl_merge

Merge two MVL data frames and write the result

mvl_inherits

Check inheritance of R or MVL objects

mvl_start_write_vector

Piecewise output of very long numeric and integer vectors

[.MVL

MVL handle subscription operator

print.MVL_OBJECT

Print MVL object. This is a convenience function for displaying MVL_OBJECTs.

mvl_xlength

Return length of MVL or R vector as a numeric value

names.MVL

Print MVL directory

mvl_add_directory_entries

Add entries to MVL directory

mvl_get_neighbors

Retrieve indices of nearby rows

mvl_group

Group identical rows

mvl_write_spatial_groups

Write spatial group information for each row

[[.MVL_OBJECT

MVL object subscription operator

[.MVL_OBJECT

MVL object subscription operator

mvl_class

Return underlying R class of object

mvl_write_spatial_index1

Write spatial group information for each row

names.MVL_OBJECT

Retrieve MVL object names

mvl_index_lapply

Apply function to indices of nearby rows

mvl_write_hash_vectors

Write hash values for each row

print.MVL

Print MVL

mvl_write_groups

Write group information for each row

mvl_indexed_copy

Index copy vector

mvl2R

Make sure the object is fully converted to its R representation

mvl_find_matches

Find matching rows

dim.MVL_OBJECT

Obtain dimensions of MVL object

$.MVL

MVL handle subscription operator

mvl_close

Close MVL file

length.MVL_OBJECT

Obtain length of MVL object

mvl_extent_index_lapply

Apply function to indices of rows with matching hashes

mvl_fused_write_objects

Concatenate objects and write result into MVL file

mvl_get_groups

Retrieve indices belonging to one or more groups