CITAN (version 2014.12-1)

lbsFindDuplicateAuthors: Find groups of authors to be merged (**EXPERIMENTAL**)

Description

Identifies groups of authors that should possibly be merged, based on similarities between the authors' names.

Usage

lbsFindDuplicateAuthors(conn, names.like = NULL, ignoreWords = c("van",
  "von", "der", "no", "author", "name", "available"), minWordLength = 4,
  orderResultsBy = c("citations", "ndocuments", "name"), aggressiveness = 0)

Arguments

conn
connection object, see lbsConnect.
names.like
character vector of SQL LIKE patterns that restrict the search to the matching authors' names.
ignoreWords
character vector; words to be ignored.
minWordLength
numeric; minimal word length to be considered.
orderResultsBy
determines the order in which results are presented; one of "citations", "ndocuments", or "name".
aggressiveness
nonnegative integer; controls the search depth.

Value

  • List of vectors of authors' identifiers to be merged. The first element of each vector is the author marked by the user as Parent, and the remaining elements are the Children.
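
As a rough illustration of this structure, the sketch below builds a made-up result and splits each group into its Parent and Children. The identifiers, and the assumption that they are integers, are hypothetical; a real result comes from the choices made in the dialog box.

```r
# Hypothetical result: two groups of author identifiers.
# The values (and their integer type) are assumptions for illustration only.
listauth <- list(c(12L, 345L, 678L), c(90L, 91L))

# The first element of each vector is the Parent ...
parents <- vapply(listauth, function(group) group[1], integer(1))
# ... and the remaining elements are the Children.
children <- lapply(listauth, function(group) group[-1])

for (i in seq_along(listauth)) {
  cat("Parent:", parents[i],
      "- Children:", paste(children[[i]], collapse = ", "), "\n")
}
```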

Details

The function uses a heuristic **EXPERIMENTAL** algorithm. Its behavior is controlled by the aggressiveness parameter.

Search results are presented in a convenient-to-use graphical dialog box. Note that the calculation often takes a few minutes!

The names.like parameter specifies search patterns in the SQL LIKE format, i.e. an underscore _ matches exactly one character and a percent sign % matches any sequence of characters (including an empty one). The search is case-insensitive.
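
To get a feel for which names a given pattern would select, the LIKE rules described above can be approximated in plain R. This is only a sketch: likeToRegex and matchesLike are hypothetical helpers, not part of CITAN, and the database's own matching may differ in details (for example in how non-ASCII letters are case-folded).

```r
# Hypothetical helper (not part of CITAN): translate an SQL LIKE pattern
# into an anchored regular expression.
likeToRegex <- function(pattern) {
  special <- c(".", "\\", "^", "$", "|", "?", "*", "+",
               "(", ")", "[", "]", "{", "}")
  chars <- strsplit(pattern, "", fixed = TRUE)[[1]]
  parts <- vapply(chars, function(ch) {
    if (ch == "%") {
      ".*"                      # % matches any sequence of characters
    } else if (ch == "_") {
      "."                       # _ matches exactly one character
    } else if (ch %in% special) {
      paste0("\\", ch)          # escape regex metacharacters
    } else {
      ch
    }
  }, character(1))
  paste0("^", paste(parts, collapse = ""), "$")
}

# Hypothetical helper: case-insensitive LIKE-style matching of names.
matchesLike <- function(names, pattern) {
  grepl(likeToRegex(pattern), names, ignore.case = TRUE)
}

# Made-up author names, for illustration only.
authors <- c("Smith J.", "SMITH JOHN", "Smyth J.", "Kowalski A.")
matchesLike(authors, "smith%")   # TRUE TRUE FALSE FALSE
matchesLike(authors, "sm_th%")   # TRUE TRUE TRUE FALSE
```

Under these assumptions, passing names.like = c("smith%", "sm_th%") would restrict the search to authors whose names match at least one of the two patterns.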

See Also

lbsMergeAuthors, lbsFindDuplicateTitles, lbsGetInfoAuthors

Examples

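library("CITAN")

# Connect to the Local Bibliometric Storage database file
conn <- lbsConnect("Bibliometrics.db")
## ...
# Search for possible duplicates; the groups to merge are chosen in the dialog box
listauth <- lbsFindDuplicateAuthors(conn,
   ignoreWords=c("van", "von", "der", "no", "author", "name", "available"),
   minWordLength=4,
   orderResultsBy="citations",
   aggressiveness=1)
# Merge the Children of each selected group into their Parent
lbsMergeAuthors(conn, listauth)
# Commit the changes (dbCommit is a DBI generic)
dbCommit(conn)
## ...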
