revtools (version 0.2.2)

find_duplicates: Locate duplicated references within a data.frame

Description

Identify potential duplicates within a data.frame containing title, journal and year data for each reference. Such a data.frame can be created by calling as.data.frame on an object of class bibliography (e.g. as returned by read_bibliography()).

Usage

find_duplicates(x)

Arguments

x

a data.frame containing title, journal and year data for each reference

Value

a data.frame with the same columns as the input, plus a numeric column named 'group'; rows that share a 'group' value are probable duplicates.
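
The 'group' column can be used to inspect or drop probable duplicates with base R. The following is only a sketch: the data.frame below is a hand-made stand-in with invented values, not real output from find_duplicates(), and it assumes that every row receives a 'group' value and that non-duplicated rows each get a value of their own.

```r
# Hypothetical stand-in for the result of find_duplicates();
# titles, journals, years and group values are invented for illustration.
x_check <- data.frame(
  title   = c("Bird song in forests", "Nest site selection", "Bird song in forests"),
  journal = c("Ibis", "Auk", "Ibis"),
  year    = c(2001, 2005, 2001),
  group   = c(1, 2, 1),
  stringsAsFactors = FALSE
)

# group values that occur more than once mark probable duplicates
repeated_groups <- unique(x_check$group[duplicated(x_check$group)])

# rows to review by hand before removing anything
flagged <- x_check[x_check$group %in% repeated_groups, ]

# keep only the first row of each group
x_unique <- x_check[!duplicated(x_check$group), ]
```

Reviewing `flagged` before subsetting is a sensible precaution, since the groups are described as probable rather than certain duplicates.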

Examples

# NOT RUN {
# load the package and import the example data shipped with it
library(revtools)
file_location <- system.file("extdata", "avian_ecology_bibliography.ris", package = "revtools")
x <- as.data.frame(read_bibliography(file_location))

# generate then locate some 'fake' duplicates
x_duplicated <- rbind(x, x[1:5, ])
x_check <- find_duplicates(x_duplicated)
# returns a data.frame with an added 'group' column
# }
