arkhe

Overview

A dependency-free collection of simple functions for cleaning rectangular data. This package lets you detect, count and replace values, or discard rows/columns, using a predicate function. It also provides tools to check conditions and return informative error messages.
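As a rough illustration of what "check a condition and return an informative error message" can look like, here is a minimal base R sketch. It is not arkhe's own code: `check_length` is a hypothetical helper written for this example, and the wording of the message is an assumption.

```r
## Hypothetical helper (not part of arkhe): fail with a message that names
## the offending object and states the expected and actual lengths.
check_length <- function(x, expected) {
  arg <- deparse(substitute(x))  # name of the object as written by the caller
  if (length(x) != expected) {
    msg <- sprintf("%s must be of length %d; not %d.",
                   arg, expected, length(x))
    stop(msg, call. = FALSE)
  }
  invisible(x)  # return the input invisibly so the check can be chained
}

v <- 1:3

## A passing check returns its input
ok <- check_length(v, 3)

## A failing check raises an error; capture its message for inspection
res <- tryCatch(check_length(v, 2), error = function(e) conditionMessage(e))
res
```

Returning the input invisibly means the check can sit at the top of a function without printing anything when it passes.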


To cite arkhe in publications use:

Frerebeau N (2024). arkhe: Tools for Cleaning Rectangular Data. Université Bordeaux Montaigne, Pessac, France. R package version 1.7.0, doi:10.5281/zenodo.3526659, https://packages.tesselle.org/arkhe/.

This package is part of the tesselle project (https://www.tesselle.org).

Installation

You can install the released version of arkhe from CRAN with:

install.packages("arkhe")

And the development version from GitHub with:

# install.packages("remotes")
remotes::install_github("tesselle/arkhe")

Usage

## Load the package
library(arkhe)

## Create a matrix
X <- matrix(sample(1:10, 25, TRUE), nrow = 5, ncol = 5)

## Add NA
k <- sample(1:25, 3, FALSE)
X[k] <- NA
X
#>      [,1] [,2] [,3] [,4] [,5]
#> [1,]    7    8    9    1   NA
#> [2,]    8    9    7    2    3
#> [3,]    8   NA    2   10    2
#> [4,]    4    7    7    5    3
#> [5,]    4    7   NA    6    1

## Count missing values in rows
count(X, f = is.na, margin = 1)
#> [1] 1 0 1 0 1
## Count non-missing values in columns
count(X, f = is.na, margin = 2, negate = TRUE)
#> [1] 5 4 4 5 4

## Find rows with NA
detect(X, f = is.na, margin = 1)
#> [1]  TRUE FALSE  TRUE FALSE  TRUE
## Find columns without any NA
detect(X, f = is.na, margin = 2, negate = TRUE, all = TRUE)
#> [1]  TRUE FALSE FALSE  TRUE FALSE

## Remove rows with any NA
discard(X, f = is.na, margin = 1, all = FALSE)
#>      [,1] [,2] [,3] [,4] [,5]
#> [1,]    8    9    7    2    3
#> [2,]    4    7    7    5    3
## Remove columns with any NA
discard(X, f = is.na, margin = 2, all = FALSE)
#>      [,1] [,2]
#> [1,]    7    1
#> [2,]    8    2
#> [3,]    8   10
#> [4,]    4    5
#> [5,]    4    6

## Replace NA with zeros
replace_NA(X, value = 0)
#>      [,1] [,2] [,3] [,4] [,5]
#> [1,]    7    8    9    1    0
#> [2,]    8    9    7    2    3
#> [3,]    8    0    2   10    2
#> [4,]    4    7    7    5    3
#> [5,]    4    7    0    6    1
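To make the predicate-and-margin idea concrete, the calls above can be approximated in base R. This is only a sketch under stated assumptions, not how arkhe implements these functions: `count_by`, `detect_by` and `discard_by` are hypothetical names, and the predicate is assumed to return a logical matrix of the same shape as its input (as `is.na` does). The matrix is typed in from the printed output above, because the original example draws it at random.

```r
## Rebuild the matrix printed above (values entered row by row)
X <- matrix(
  c(7,  8,  9,  1, NA,
    8,  9,  7,  2,  3,
    8, NA,  2, 10,  2,
    4,  7,  7,  5,  3,
    4,  7, NA,  6,  1),
  nrow = 5, ncol = 5, byrow = TRUE
)

## Hypothetical helper: count matches of predicate f along rows (1) or columns (2)
count_by <- function(x, f, margin = 1, negate = FALSE) {
  hit <- f(x)                 # logical matrix, same shape as x
  if (negate) hit <- !hit
  unname(apply(hit, margin, sum))
}

## Hypothetical helper: flag rows/columns where any (or all) values match f
detect_by <- function(x, f, margin = 1, negate = FALSE, all = FALSE) {
  hit <- f(x)
  if (negate) hit <- !hit
  test <- if (all) base::all else base::any
  unname(apply(hit, margin, test))
}

## Hypothetical helper: drop the rows/columns flagged by detect_by()
discard_by <- function(x, f, margin = 1, all = FALSE) {
  flagged <- detect_by(x, f, margin = margin, all = all)
  if (margin == 1) x[!flagged, , drop = FALSE] else x[, !flagged, drop = FALSE]
}

n_missing_rows <- count_by(X, is.na, margin = 1)
n_present_cols <- count_by(X, is.na, margin = 2, negate = TRUE)
complete_cols  <- detect_by(X, is.na, margin = 2, negate = TRUE, all = TRUE)
rows_kept      <- discard_by(X, is.na, margin = 1)
cols_kept      <- discard_by(X, is.na, margin = 2)

## Replacing missing values is a single logical-index assignment in base R
X0 <- X
X0[is.na(X0)] <- 0
```

In this sketch `negate = TRUE` flips the predicate before it is summarised, so "all values are non-missing" is written as `negate = TRUE, all = TRUE`, which mirrors how the arguments read in the `detect()` call above.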

Contributing

Please note that the arkhe project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.

Monthly Downloads

1,015

Version

1.7.0

License

GPL (>= 3)

Maintainer

Nicolas Frerebeau

Last Published

July 29th, 2024

Functions in arkhe (1.7.0)

assert_length: Check Object Length/Dimensions
assert_package: Check the Availability of a Package
clean_whitespace: Remove Leading/Trailing Whitespace
check_class: Class Diagnostic
assert_square: Check Matrix
assert_type: Check Data Types
count: Count Values Using a Predicate
math_lcm: Least Common Multiple
confidence_binomial: Confidence Interval for Binomial Proportions
discard: Remove Rows/Columns Using a Predicate
null: Default Value for NULL
detect: Find Rows/Columns Using a Predicate
predicate-utils: Utility Predicates
is_scalar: Scalar Type Predicates
compact: Remove Empty Rows/Columns
interval_hdr: Highest Density Regions
jackknife: Jackknife Estimation
keep: Keep Rows/Columns Using a Predicate
concat: Concatenate
remove_Inf: Remove Rows/Columns with Infinite Values
replace_empty: Replace Empty String
assign: Assign a Specific Row/Column to the Column/Row Names
describe: Data Description
conditions: Conditions
get: Get Rows/Columns by Name
math_gcd: Greatest Common Divisor
label_percent: Label Percentages
bootstrap: Bootstrap Estimation
confidence_multinomial: Confidence Interval for Multinomial Proportions
confidence_mean: Confidence Interval for a Mean
replace_Inf: Replace Infinite Values
interval_credible: Bayesian Credible Interval
predicate-numeric: Numeric Predicates
remove_constant: Remove Constant Columns
remove_empty: Remove Rows/Columns with Empty String
seek: Search Rows/Columns by Name
remove_NA: Remove Rows/Columns with Missing Values
remove_zero: Remove Rows/Columns with Zeros
sparsity: Sparsity
with_seed: Evaluate an Expression with a Temporary Seed
validate: Validate a Condition
predicate-matrix: Matrix Predicates
scale_midpoint: Rescale Continuous Vector (minimum, midpoint, maximum)
replace_NA: Replace Missing Values
scale_range: Rescale Continuous Vector (minimum, maximum)
replace_zero: Replace Zeros
predicate-type: Type Predicates
predicate-trend: Numeric Trend Predicates
assert_numeric: Check Numeric Values
assert_lower: Check Numeric Relations
arkhe-deprecated: Deprecated Functions in arkhe
assert_constant: Check Numeric Trend
append: Convert Row Names to an Explicit Column
assert_data: Check Data
assert_names: Check Object Names
arkhe-package: arkhe: Tools for Cleaning Rectangular Data