
termExtraction extracts terms from a text field (abstract, title, author's keywords, etc.) of a bibliographic data frame.
termExtraction(
  M,
  Field = "TI",
  stemming = FALSE,
  language = "english",
  remove.numbers = TRUE,
  remove.terms = NULL,
  keep.terms = NULL,
  synonyms = NULL,
  verbose = TRUE
)
M is a data frame obtained by the converting function convert2df. It is a data matrix with cases corresponding to articles and variables corresponding to the Field Tags in the original WoS or SCOPUS file.
Field is a character object. It indicates the field tag of the textual data to process:
"TI"   Manuscript title
"AB"   Manuscript abstract
"ID"   Manuscript keywords plus
The default is Field = "TI".
stemming is logical. If TRUE, the Porter Stemming algorithm is applied to all extracted terms. The default is stemming = FALSE.
language is a character object. It is the language of the textual content ("english", "german", "italian", "french", "spanish"). The default is language = "english".
remove.numbers is logical. If TRUE, all numbers are deleted from the documents before term extraction. The default is remove.numbers = TRUE.
remove.terms is a character vector. It contains additional terms to delete from the documents before term extraction. The default is remove.terms = NULL.
keep.terms is a character vector. It contains compound words (formed by two or more terms) to keep in their original form during term extraction. The default is keep.terms = NULL.
synonyms is a character vector. Each element contains a list of synonyms, separated by ";", that will be merged into a single term (the first term in that element). The default is synonyms = NULL.
verbose is logical. If TRUE, the function prints the most frequent terms extracted from the documents. The default is verbose = TRUE.
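The cleaning arguments can be easier to reason about on a toy input. The following base R sketch imitates, on a single string, what remove.numbers, keep.terms, remove.terms and synonyms are described as doing. It is not the code used by termExtraction: the helper name toy_clean_terms, the use of "_" to protect compound words, and the order of the steps are all assumptions made for illustration.

```r
# Toy illustration only: NOT the code used by termExtraction().
# toy_clean_terms() is a hypothetical helper; the step order is an assumption.
toy_clean_terms <- function(text, remove.numbers = TRUE, remove.terms = NULL,
                            keep.terms = NULL, synonyms = NULL) {
  text <- tolower(text)

  # keep.terms: protect compound words by joining their parts with "_"
  for (kt in keep.terms) {
    text <- gsub(kt, gsub("[ -]", "_", kt), text, fixed = TRUE)
  }

  # remove.numbers: drop digits before tokenising
  if (remove.numbers) text <- gsub("[0-9]+", " ", text)

  # split on anything that is not a lowercase letter or "_"
  terms <- strsplit(text, "[^a-z_]+")[[1]]
  terms <- terms[nzchar(terms)]

  # restore the protected compound words to their original form
  for (kt in keep.terms) {
    terms[terms == gsub("[ -]", "_", kt)] <- kt
  }

  # remove.terms: drop the additional unwanted terms
  terms <- terms[!(terms %in% remove.terms)]

  # synonyms: map every listed variant to the first term of its group
  for (s in synonyms) {
    group <- trimws(strsplit(s, ";", fixed = TRUE)[[1]])
    terms[terms %in% group] <- group[1]
  }
  terms
}

result <- toy_clean_terms(
  "A co-citation analysis of 25 journals and their impact",
  remove.terms = c("a", "of", "and", "their"),
  keep.terms   = c("co-citation analysis"),
  synonyms     = c("periodicals; journals")
)
print(result)
```

In this sketch the compound word survives as one term, the number is dropped, and "journals" is replaced by "periodicals" because "periodicals" comes first in its synonym group.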
The function returns the bibliometric data frame with a new column containing the terms extracted from the field indicated by the argument Field (for example, TI_TM when Field = "TI").
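A common next step is to count how often each extracted term occurs. The sketch below uses a small mock data frame with invented values rather than real termExtraction output, and it assumes that each cell of the new column (named TI_TM here, as in the title example) stores its terms as a single string separated by ";". That separator is an assumption, so inspect the actual column before relying on it.

```r
# Mock data standing in for termExtraction() output; the values are invented.
# Assumption: terms within a cell are separated by ";".
mock <- data.frame(
  TI_TM = c("citation;analysis;journals",
            "citation;network",
            "journals;citation"),
  stringsAsFactors = FALSE
)

# split each cell into its terms and flatten into one character vector
all_terms <- trimws(unlist(strsplit(mock$TI_TM, ";", fixed = TRUE)))

# frequency table, most frequent terms first
term_freq <- sort(table(all_terms), decreasing = TRUE)
print(term_freq)
```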
See also:
convert2df to import and convert a WoS or SCOPUS export file into a bibliographic data frame.
biblioAnalysis for bibliometric analysis.
# NOT RUN {
# Example 1: Term extraction from titles
library(bibliometrix)
data(scientometrics)
# vector of compound words to keep in their original form
keep.terms <- c("co-citation analysis", "bibliographic coupling")
# term extraction
scientometrics <- termExtraction(scientometrics, Field = "TI",
  remove.numbers = TRUE, remove.terms = NULL, keep.terms = keep.terms, verbose = TRUE)
# terms extracted from the first 10 titles
scientometrics$TI_TM[1:10]
# Example 2: Term extraction from abstracts
data(scientometrics)
# vector of terms to remove
remove.terms <- c("analysis", "bibliographic")
# term extraction, with stemming
scientometrics <- termExtraction(scientometrics, Field = "AB", stemming = TRUE, language = "english",
  remove.numbers = TRUE, remove.terms = remove.terms, keep.terms = NULL, verbose = TRUE)
# terms extracted from the first abstract
scientometrics$AB_TM[1]
# Example 3: Term extraction from keywords with synonyms
data(scientometrics)
# vector of synonyms; each group is merged into its first term
synonyms <- c("citation; citation analysis", "h-index; index; impact factor")
# term extraction
scientometrics <- termExtraction(scientometrics, Field = "ID",
  synonyms = synonyms, verbose = TRUE)
# }