contentanalysis (version 0.2.1)

create_citation_network: Create Citation Co-occurrence Network

Description

Creates an interactive network visualization of citation co-occurrences within a document. Citations that appear close to each other are connected, with the strength of the connection based on their distance (in characters). Nodes are colored by the document section where citations primarily appear.

Usage

create_citation_network(
  citation_analysis_results,
  max_distance = 1000,
  min_connections = 1,
  show_labels = TRUE
)

Value

A visNetwork object representing the interactive citation network, or NULL if no valid network can be created. The returned object has an additional stats attribute containing:

  • n_nodes: Number of nodes in the network

  • n_edges: Number of edges in the network

  • avg_distance: Average distance between connected citations

  • max_distance: Maximum distance parameter used

  • section_distribution: Distribution of citations across sections

  • multi_section_citations: Citations appearing in multiple sections

  • section_colors: Color mapping for sections
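To make the meaning of these fields concrete, the following base R sketch computes comparable summaries from toy data. It is only an illustration under stated assumptions: the data are hypothetical, the object name toy_stats is invented here, and the package may compute or structure these values differently.

```r
# Hypothetical toy data; not produced by the package.
edges <- data.frame(
  citation1 = c("A", "A", "B"),
  citation2 = c("B", "C", "C"),
  distance = c(100, 500, 900),
  stringsAsFactors = FALSE
)
citations <- data.frame(
  citation_text_clean = c("A", "A", "B", "C"),
  section = c("Introduction", "Methods", "Introduction", "Methods"),
  stringsAsFactors = FALSE
)

# Unique citations that take part in at least one edge
nodes <- unique(c(edges$citation1, edges$citation2))

# Number of distinct sections in which each citation appears
sections_per_citation <- tapply(
  citations$section,
  citations$citation_text_clean,
  function(s) length(unique(s))
)

# Assumed layout of the summaries; the real stats attribute may differ
toy_stats <- list(
  n_nodes = length(nodes),
  n_edges = nrow(edges),
  avg_distance = mean(edges$distance),
  section_distribution = table(citations$section),
  multi_section_citations =
    names(sections_per_citation)[sections_per_citation > 1]
)

str(toy_stats)
```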

Arguments

citation_analysis_results

A list object returned by citation analysis functions, containing at least the following elements:

  • network_data: A data frame with columns citation1, citation2, and distance representing pairs of co-occurring citations

  • citations: A data frame with columns citation_text_clean and section containing citation text and section information

  • section_colors: A named vector of colors for each section
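As a rough illustration of the expected shape, the list described above could be assembled by hand as follows. The citation labels, section names, and colors are hypothetical placeholders; in practice this object would come from the package's citation analysis functions and may carry additional elements.

```r
# Hypothetical toy input; column names follow the argument description.
network_data <- data.frame(
  citation1 = c("Author A (2020)", "Author A (2020)", "Author B (2019)"),
  citation2 = c("Author B (2019)", "Author C (2021)", "Author C (2021)"),
  distance = c(120, 450, 900),  # distance in characters
  stringsAsFactors = FALSE
)

citations <- data.frame(
  citation_text_clean = c("Author A (2020)", "Author B (2019)",
                          "Author C (2021)", "Author A (2020)"),
  section = c("Introduction", "Introduction", "Methods", "Methods"),
  stringsAsFactors = FALSE
)

# Named vector: names are section labels, values are colors
section_colors <- c(Introduction = "#1f77b4", Methods = "#ff7f0e")

citation_analysis_results <- list(
  network_data = network_data,
  citations = citations,
  section_colors = section_colors
)

# Inspect the top-level structure
str(citation_analysis_results, max.level = 1)
```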

max_distance

Numeric. Maximum distance (in characters) between citations to be considered connected. Default is 1000.

min_connections

Integer. Minimum number of connections a citation must have to be included in the network. Default is 1.

show_labels

Logical. Whether to show citation labels on the network nodes. Default is TRUE.

Details

The function creates a network where:

  • Nodes represent unique citations

  • Node size is proportional to the number of connections

  • Node color indicates the primary section where the citation appears

  • Node border is thicker (3px) for citations appearing in multiple sections

  • Edges connect citations that co-occur within the specified distance

  • Edge width decreases with distance (closer citations = thicker edges)

  • Edge color indicates distance: red (<=300 chars), blue (<=600 chars), gray (>600 chars)

The network uses the Fruchterman-Reingold layout algorithm to position nodes. Interactive features include zooming, panning, node dragging, and highlighting of nearest neighbors on hover.
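The filtering and edge styling rules listed above can be sketched in base R as follows. This is not the package's implementation: the toy edge list is hypothetical, the inclusive comparison with max_distance and the single-pass min_connections filter are assumptions, and the linear width formula is invented for illustration since the documentation states only that width decreases with distance. The color thresholds are the documented ones.

```r
# Hypothetical toy edge list; distance is measured in characters.
edges <- data.frame(
  citation1 = c("A", "A", "B", "C"),
  citation2 = c("B", "C", "C", "D"),
  distance = c(150, 450, 800, 1500),
  stringsAsFactors = FALSE
)

max_distance <- 1000
min_connections <- 2

# Keep only pairs within max_distance (inclusive comparison is an assumption)
edges <- edges[edges$distance <= max_distance, ]

# Count connections per citation, then drop citations below min_connections
degree <- table(c(edges$citation1, edges$citation2))
keep <- names(degree)[degree >= min_connections]
edges <- edges[edges$citation1 %in% keep & edges$citation2 %in% keep, ]

# Documented color thresholds: red (<=300), blue (<=600), gray (>600)
edges$color <- ifelse(edges$distance <= 300, "red",
                      ifelse(edges$distance <= 600, "blue", "gray"))

# Assumed width mapping: closer citations get thicker edges
edges$width <- 1 + 4 * (1 - edges$distance / max_distance)

print(edges)
```

With this toy input, the pair C and D is dropped because its distance exceeds max_distance, and the remaining three edges receive one color each from the three documented bands.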

Examples

if (FALSE) {
# Assuming you have citation_analysis_results from a previous analysis
network <- create_citation_network(
  citation_analysis_results,
  max_distance = 800,
  min_connections = 2,
  show_labels = TRUE
)

# Display the network
network

# Access network statistics
stats <- attr(network, "stats")
print(stats$n_nodes)
print(stats$section_distribution)
}
