vvauditor

vvauditor is an R package for validating the integrity of data through a set of assertion tests. These tests check a dataset for specific conditions or properties, which supports quality control in data analysis workflows and helps identify and correct errors or inconsistencies in the data.

Installation

To install the vvauditor package, you can use the install.packages() function in R:


install.packages("vvauditor")

Usage

Once the package is installed, you can load it into your R session using the library() function:

library(vvauditor)

Assertion Tests

The vvauditor package provides a range of assertion tests that can be applied to datasets, including checks for missing values (assert_missing_values), outliers (identify_outliers), distributional assumptions, and more. Incorporating these tests into a data analysis workflow helps verify the quality and reliability of the data before it is used.
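To make the idea of an assertion test concrete, here is a minimal base-R sketch of two such checks: one for missing values and one for outliers. It does not use vvauditor's API. The helpers assert_no_missing and find_iqr_outliers, the example data frame, and the choice of the 1.5 * IQR rule are all assumptions made for illustration.

```r
# Base-R sketch of assertion-style checks. These helpers are hypothetical
# and are NOT part of vvauditor's API.

# Stop with an error if the given column contains any missing values.
assert_no_missing <- function(data, column) {
  n_missing <- sum(is.na(data[[column]]))
  if (n_missing > 0) {
    stop(sprintf("Column '%s' has %d missing value(s).", column, n_missing))
  }
  invisible(TRUE)
}

# Return the values lying more than 1.5 * IQR outside the quartiles.
find_iqr_outliers <- function(x) {
  q <- quantile(x, c(0.25, 0.75), na.rm = TRUE, names = FALSE)
  fence <- 1.5 * (q[2] - q[1])
  x[!is.na(x) & (x < q[1] - fence | x > q[2] + fence)]
}

# Illustrative data: one score is far from the rest.
df <- data.frame(id = 1:6, score = c(10, 12, 11, 13, 12, 95))

assert_no_missing(df, "score")           # passes silently
outliers <- find_iqr_outliers(df$score)  # 95
```

The check that fails loudly (stop) suits conditions that should block further analysis, while the outlier helper only reports values so the analyst can decide what to do with them.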

For more information on the available assertion tests and how to use them, please refer to the package documentation.
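One way to explore the documentation from within R is sketched below, using only base R functions. The helper list_exports is a hypothetical convenience wrapper, not something vvauditor provides; it returns an empty vector when the package is not installed.

```r
# List a package's exported names, or character(0) if it is not installed.
# `list_exports` is a hypothetical helper built on base R only.
list_exports <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    return(character(0))
  }
  sort(getNamespaceExports(pkg))
}

exports <- list_exports("vvauditor")

# Keep only the names that start with "assert_".
grep("^assert_", exports, value = TRUE)

# In an interactive session, open the help index or a single help page:
# help(package = "vvauditor")
# ?run_all_assertions
```

Reading the help page of an individual function is the reliable way to confirm its arguments before using it in a workflow.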

Contributing

If you're interested in contributing to the vvauditor package, you can follow these steps:

  1. Fork the vvauditor repository on GitHub.
  2. Make your desired changes to the code.
  3. Submit a pull request to the main repository.

Please refer to the contribution guidelines in the package repository for more information on how to contribute.

Monthly Downloads: 12,601
Version: 0.7.0
License: MIT + file LICENSE
Maintainer: Tomer Iwan
Last Published: February 10th, 2025

Functions in vvauditor (0.7.0)

create_numeric_details: Create numeric details csv
create_dataset_summary_table: Create dataset summary statistics table
create_subset_fields: Create subset fields
check_numeric_or_integer_type: Check for Numeric or Integer Type
regex_content_parameter: Construct Regex for Matching Function Parameter Content
check_non_zero_rows: Check for Non-Zero Rows
find_maximum_value: Find the maximum numeric value in a vector, ignoring non-numeric values
md_complete_cases: MD complete cases
find_common_columns: Find Common Columns Between Data Frames
retrieve_function_calls: retrieve_function_calls
get_distribution_statistics: Compute distribution statistics for a numeric vector
remove_duplicates_and_na: Remove Duplicates and NA Values from Input
get_first_element_class: Retrieve the class of the first element of a vector
get_values: Get values of column
check_zero_columns: Check for Columns with Only 0s
count_more_than_1: Count more than 1
regex_year_date: Generate regular expression of a year date.
identify_join_pairs: Identify Possible Join Pairs Between Data Frames
regex_time: Generate regular expression of a time.
create_field_info: Create field info
find_pattern_r: Find pattern in R scripts
find_minimum_value: Find the minimum numeric value in a vector, ignoring non-numeric values
identify_outliers: Identify Outliers in a Data Frame Column
check_posixct_type: Check for POSIXct Type
run_all_assertions: Run All Data Validation Assertions
str_detect_in_file: Detect string in file
retrieve_functions_and_packages: Retrieve functions and packages
test_all_equal: Test all equal
is_unique_column: Check if a column in a dataframe has unique values
unique_id: unique id
retrieve_package_usage: Retrieve packages that are loaded in a script
return_assertions_message: Return Assertion Messages
return_mtcars_testfile: Read and return the mtcars testfile
check_rows: Check rows
create_data_types: Create data types tibble
create_categorical_details: Create categorical details csv
retrieve_string_assignments: retrieve_string_assignments
retrieve_sourced_scripts: retrieve_sourced_scripts
assert_type_consistency: Assert Type Consistency Between Data and Metadata
assert_logical_named: Assert Logical Value in Column
assert_field_consistency: Check if the fieldnames of the dataset are the same
assert_field_distinctness: Assert Field Uniqueness Consistency Between Data and Metadata
assert_missing_values: Assert Consistency of Missing Values in Data
assert_date_named: Assert Date Value in Column
assertion_message: Assert Message Based on Type
assert_field_existence: Assert Field Existence in New Data
assert_range_validation: Assert Range Validation for Data Fields
assert_no_duplicates_in_group: Assert No Duplicates in Group
check_duplicates: Check for Duplicate Rows in Selected Columns
check_na_columns: Check for columns with only NA values
calculate_category_percentages: Calculate the percentage of categories in a data vector
drop_na_column_names: Drop NA column names
check_double_columns: check double columns
duplicates_in_column: Duplicates in column
check_no_duplicates_in_group: Check for No Duplicates in Group
check_no_duplicate_rows: Check for No Duplicate Rows