This function calculates the coupling strength measure, following @vladutz1984 and @shen2019biblionetwork,
from a direct citation data frame. It is a refinement of biblio_coupling():
it takes into account how often a reference shared by two articles has been cited in the whole corpus.
In other words, the most cited references count for less in the link between two articles than references that have
rarely been cited. To a certain extent, it is similar to the tf-idf measure.
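The tf-idf analogy can be illustrated with a small base R sketch. This is not the package's implementation: the toy data, the `shared_weight()` helper and the `log(n_docs / ref_freq)` weight are assumptions chosen only to show how a frequently cited reference contributes less to a link than a rarely cited one.

```r
# Toy citation table: each row is one (citing article, cited reference) pair.
citations <- data.frame(
  source = c("A", "A", "B", "B", "C", "C", "D"),
  ref    = c("r1", "r2", "r1", "r2", "r1", "r3", "r1"),
  stringsAsFactors = FALSE
)

# Number of citing articles in the corpus.
n_docs <- length(unique(citations$source))

# Number of times each reference is cited in the whole corpus.
ref_freq <- vapply(split(citations$source, citations$ref), length, integer(1))

# Hypothetical idf-style weight: rarely cited references weigh more.
idf <- log(n_docs / ref_freq)

# Sum the weights of the references shared by two articles (illustrative helper).
shared_weight <- function(a, b) {
  refs_a <- citations$ref[citations$source == a]
  refs_b <- citations$ref[citations$source == b]
  shared <- intersect(refs_a, refs_b)
  sum(idf[shared])
}

shared_weight("A", "B")  # share r1 (cited by every article) and r2 (cited twice)
shared_weight("A", "C")  # share only r1, which every article cites
```

In this sketch, r1 is cited by all four articles and therefore adds nothing to any link, while r2 still distinguishes the pair A and B. The actual measure may also normalize by the length of each bibliography, which the sketch leaves out.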
coupling_strength(
dt,
source,
ref,
weight_threshold = 1,
output_in_character = TRUE
)
dt: The data frame with citing and cited documents.
source: The column name of the source identifiers, that is, the documents that are citing.
ref: The column name of the references that are cited.
weight_threshold: The minimum non-normalized weight an edge must have to be kept. The function keeps only the edges
whose non-normalized weight is greater than or equal to weight_threshold. In other words, if you set the
parameter to 2, the function keeps only the edges between nodes that share at least two references
in their bibliographies. In a large bibliographic coupling network,
you may consider, for instance, that sharing only one reference is not sufficient for two articles to be linked together.
This parameter can also be raised to avoid creating intractable networks with too many edges.
output_in_character: If TRUE, the function ends by converting the from and to columns to character, to make the
creation of a tidygraph graph easier.
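The effect of weight_threshold can be sketched in base R. The example below is an assumption-laden illustration, not the package's code: the toy data, the `n_shared` column and the `keep_edges()` helper are hypothetical. It counts the references shared by each pair of articles (the non-normalized weight) and keeps only the pairs that reach the threshold.

```r
# Toy citation table: each row is one (citing article, cited reference) pair.
citations <- data.frame(
  source = c("A", "A", "B", "B", "C", "C", "D"),
  ref    = c("r1", "r2", "r1", "r2", "r1", "r3", "r1"),
  stringsAsFactors = FALSE
)

# Self-join on the reference to list every pair of articles citing it.
pairs <- merge(citations, citations, by = "ref", suffixes = c("_from", "_to"))

# Keep each unordered pair once and drop self-pairs.
pairs <- pairs[pairs$source_from < pairs$source_to, ]

# Non-normalized weight: number of references shared by each pair.
edges <- aggregate(ref ~ source_from + source_to, data = pairs, FUN = length)
names(edges) <- c("from", "to", "n_shared")

# Hypothetical helper mimicking the threshold: keep edges with at least
# `weight_threshold` shared references.
keep_edges <- function(edges, weight_threshold = 1) {
  edges[edges$n_shared >= weight_threshold, ]
}

keep_edges(edges, weight_threshold = 1)  # every pair sharing at least one reference
keep_edges(edges, weight_threshold = 2)  # only pairs sharing two or more
```

With the default of 1, every pair that shares a reference is kept. Raising the threshold to 2 leaves only the pair A and B, which shows how the parameter prunes weak links in a dense network.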
A data.table with the article identifiers in the from and to columns, and the coupling strength measure in
another column. It also keeps a copy of from and to in the Source and Target columns. This is useful if you
then use the tidygraph package, where the from and to values are modified when creating a graph.
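The shape of the returned table can be mimicked with a short base R sketch. The identifiers, the weight values and the column name `weight` below are made up for illustration, and the real function returns a data.table rather than a data frame. The sketch only shows the conversion to character and the duplicated Source and Target columns described above.

```r
# Hypothetical edge list with numeric article identifiers and made-up weights.
edges <- data.frame(
  from   = c(101L, 101L, 205L),
  to     = c(205L, 310L, 310L),
  weight = c(0.12, 0.03, 0.07)
)

# Mimic output_in_character = TRUE: store identifiers as character strings.
edges$from <- as.character(edges$from)
edges$to   <- as.character(edges$to)

# Keep untouched copies of the identifiers, so the original article ids stay
# available even if a graph library later rewrites from and to.
edges$Source <- edges$from
edges$Target <- edges$to

str(edges)
```

Keeping the copies means the original identifiers can still be read from Source and Target after the graph is built.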
library(biblionetwork)
coupling_strength(Ref_stagflation,
                  source = "Citing_ItemID_Ref",
                  ref = "ItemID_Ref")