visdat (version 0.0.4.9999)

vis_compare: compare two dataframes and see where they are different.

Description

vis_compare, like the other vis_* functions, gives an at-a-glance ggplot of a dataset, but in this case homes in on visualising **two** different dataframes, which currently need to have exactly the same dimensions. This function has not been implemented yet; this page serves as a note on how it might work. It would be very similar to vis_miss, except that cells are coloured according to "match" and "non-match". The core comparison is simple: `x <- 1:10; y <- c(1:5, 10:14); x == y` returns `[1] TRUE TRUE TRUE TRUE TRUE FALSE FALSE FALSE FALSE FALSE`. Here black could indicate a match, and white a non-match.

One challenge is the case where the datasets have different dimensions. One option would be to report, as a message, the columns that do not appear in both datasets. Rows could be matched by row number, dropping the trailing rows of the longer dataset and emitting a note. Then, if the user wants, an ID/key could be used to match rows instead.
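Since the function is not implemented yet, the cell-wise comparison described above can only be sketched. The following is one possible approach in base R, not visdat's actual code. The helper name `compare_cells()` is hypothetical, and comparing values as character strings is a simplifying assumption made so that columns of any type can be handled the same way. Because `==` returns `NA` when either side is `NA`, the sketch counts two `NA`s as a match and a single `NA` as a non-match.

```r
# Hypothetical helper, not part of visdat: returns a logical matrix that is
# TRUE where the two data frames hold the same value in the same cell.
compare_cells <- function(df1, df2) {
  # The note above requires identical dimensions
  stopifnot(identical(dim(df1), dim(df2)))

  same <- matrix(FALSE, nrow = nrow(df1), ncol = ncol(df1),
                 dimnames = list(NULL, names(df1)))

  for (j in seq_along(df1)) {
    # Assumption: compare as character so factors and numbers are treated alike
    a <- as.character(df1[[j]])
    b <- as.character(df2[[j]])

    eq <- a == b                 # NA wherever either value is missing
    eq[is.na(eq)] <- FALSE       # a single NA counts as a non-match
    same[, j] <- eq | (is.na(a) & is.na(b))  # two NAs count as a match
  }

  same
}

iris_diff <- iris
iris_diff[1:10, 1:2] <- NA

same <- compare_cells(iris, iris_diff)
sum(!same)  # 20 cells differ: rows 1 to 10 of the first two columns
```

The resulting logical matrix could then be reshaped to long form and drawn as tiles, in the same way vis_miss colours missing and present cells.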

Usage

vis_compare(df1, df2)

Arguments

df1
the first dataframe to compare
df2
the second dataframe to compare
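The Description suggests how `df1` and `df2` might be handled when their dimensions differ: report the columns that are not shared, and match rows by row number after dropping the trailing ones. A minimal sketch of that idea follows. The helper name `align_for_compare()` and its return value (a list of two trimmed data frames) are assumptions made for illustration, not part of visdat.

```r
# Hypothetical helper, not part of visdat: trims two data frames to their
# shared columns and to the row count of the shorter one.
align_for_compare <- function(df1, df2) {
  shared  <- intersect(names(df1), names(df2))
  dropped <- setdiff(union(names(df1), names(df2)), shared)

  # Report columns that appear in only one of the data frames
  if (length(dropped) > 0) {
    message("Columns not present in both data frames: ",
            paste(dropped, collapse = ", "))
  }

  # Match rows by row number, dropping the trailing rows of the longer one
  n <- min(nrow(df1), nrow(df2))
  if (nrow(df1) != nrow(df2)) {
    message("Row counts differ; comparing only the first ", n, " rows.")
  }

  list(
    df1 = df1[seq_len(n), shared, drop = FALSE],
    df2 = df2[seq_len(n), shared, drop = FALSE]
  )
}

# iris has 150 rows and 5 columns; the second argument has 100 rows and 4
aligned <- align_for_compare(iris, iris[1:100, -5])
dim(aligned$df1)
dim(aligned$df2)
```

Matching by an ID/key column, the other option mentioned in the Description, would replace the row-number step with a join on that key.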

Examples

# make a new dataset of iris that contains some NA values
iris_diff <- iris
iris_diff[1:10, 1:2] <- NA

library(visdat)

vis_compare(iris, iris_diff)