Find documents to be merged (**EXPERIMENTAL**)
Finds groups of documents that should possibly be merged, based on similarities between the documents' titles.
lbsFindDuplicateTitles(conn, surveyDescription = NULL, ignoreTitles.like = NULL, aggressiveness = 1)
- conn: connection object, see lbsConnect
- surveyDescription: character string or NULL; survey description to restrict to, or NULL
- ignoreTitles.like: character vector of SQL-LIKE patterns to match documents' titles to be ignored, or NULL
- aggressiveness: nonnegative integer; 0 for showing only exact matches; the higher the value, the more documents will be proposed
The function computes fuzzy similarity measures of the titles. Its specificity is controlled by the aggressiveness parameter.
Search results are presented in a convenient-to-use graphical dialog box. The function tries to order the groups of documents according to their relevance (**EXPERIMENTAL** algorithm). Note that the calculation often takes a few minutes!
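The similarity measure the function uses is not described here. As a rough illustration of how a threshold such as aggressiveness can widen or narrow the set of proposed matches, the sketch below uses Levenshtein edit distance from base R (utils::adist) as a stand-in. The helper findSimilarTitlePairs and the rule "distance not greater than aggressiveness" are assumptions made for this example and are not part of the package.

```r
# Hypothetical helper, NOT the package's algorithm: propose pairs of titles
# whose edit distance does not exceed `aggressiveness`.
findSimilarTitlePairs <- function(titles, aggressiveness = 1) {
  stopifnot(is.character(titles), aggressiveness >= 0)
  # Normalize: ignore case and surrounding whitespace.
  norm <- tolower(trimws(titles))
  # Pairwise Levenshtein distances between the normalized titles.
  d <- utils::adist(norm)
  # Keep each unordered pair once (upper triangle only).
  close <- which(d <= aggressiveness & upper.tri(d), arr.ind = TRUE)
  data.frame(first = close[, "row"],
             second = close[, "col"],
             distance = d[close])
}

titles <- c("On the h-index", "On the h index",
            "on the h-index ", "Bibliometrics in R")

exact <- findSimilarTitlePairs(titles, aggressiveness = 0)  # exact matches only
fuzzy <- findSimilarTitlePairs(titles, aggressiveness = 1)  # one edit allowed
```

With aggressiveness = 0 only titles 1 and 3 are paired, since they are identical after normalization. With aggressiveness = 1 title 2 joins the group as well, which mirrors the documented behaviour that higher values propose more documents.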
The ignoreTitles.like parameter determines search patterns in the SQL LIKE format, i.e. an underscore _ matches a single character and a percent sign % matches any sequence of characters (including an empty one). The search is case-insensitive.
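To check which titles a set of patterns would exclude before running the search, the LIKE semantics described above can be emulated in plain R. This is a minimal sketch: likeToRegex and matchesAnyLike are hypothetical helpers written for this example, not package functions, and the actual matching is presumably done by the database engine.

```r
# Hypothetical helper (not part of the package): translate an SQL LIKE
# pattern into an anchored regular expression.
likeToRegex <- function(pattern) {
  special <- c(".", "\\", "|", "(", ")", "[", "]", "{", "}",
               "^", "$", "*", "+", "?")
  chars <- strsplit(pattern, "", fixed = TRUE)[[1]]
  parts <- vapply(chars, function(ch) {
    if (ch == "%") ".*"                             # any sequence of characters
    else if (ch == "_") "."                         # exactly one character
    else if (ch %in% special) paste0("\\", ch)      # escape regex metacharacters
    else ch
  }, character(1))
  paste0("^", paste(parts, collapse = ""), "$")
}

# Hypothetical helper: TRUE for every title matched by at least one pattern.
# Matching is case-insensitive, as described for ignoreTitles.like.
matchesAnyLike <- function(titles, patterns) {
  ignored <- rep(FALSE, length(titles))
  for (p in patterns) {
    ignored <- ignored | grepl(likeToRegex(p), titles, ignore.case = TRUE)
  }
  ignored
}

titles <- c("In this issue", "EDITORIAL", "Editorial board",
            "Letter to the editor", "A new h-index")
patterns <- c("%In this issue%", "%Editorial", "Letter to %")
ignored <- matchesAnyLike(titles, patterns)
```

Here "Editorial board" is not matched, because the pattern "%Editorial" requires the title to end with "Editorial"; a trailing % would be needed to cover it.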
A numeric vector of user-selected documents' identifiers to be removed.
    ## Not run:
    conn <- lbsConnect("Bibliometrics.db");
    ## ...
    listdoc <- lbsFindDuplicateTitles(conn,
       ignoreTitles.like=c("%In this issue%", "%Editorial", "%Introduction",
       "Letter to %", "%Preface"),
       aggressiveness=2);
    lbsDeleteDocuments(conn, listdoc);
    dbCommit(conn);
    ## ...
    ## End(Not run)