Transforms time series data of local election results into a set of network data for use in Local Political Actor Network Diachronic Analysis (LPANDA). The function constructs a bipartite network (candidate – candidate list), its projected one-mode networks (candidate – candidate and list – list), a continuity graph (linking candidate lists between adjacent elections), and an elections network (its node attributes can serve as electoral statistics). It also detects parties (as clusters of candidate lists based on community detection applied to the bipartite network) and constructs their network.
prepare_network_data(df, input_variable_map = list(), verbose = TRUE, ...)
A list of network data objects for diachronic analysis
using LPANDA or other social network analysis tools. Each component contains
edgelist (data.frame of edges) and node_attr (data.frame of node
attributes). The exact set of columns depends on the input and may evolve. See
Output data structure for a description of the returned object.
df: A data.frame containing data from elections, with one row per candidate. The function also accepts a single election, though diachronic outputs will then be empty or trivial. See the Expected structure of input data section for the expected data format and required variables.
input_variable_map: A list mapping variable names in df
that differ from the expected ones:
elections = unique election identifiers (numeric),
candidate = candidate's name used as a unique identifier (character),
list_name = name of the candidate list (character),
list_pos = candidate's position on the list (numeric),
pref_votes = preferential votes received by the candidate (numeric),
list_votes = * total votes received by the candidate list (numeric),
elected = whether the candidate was elected (logical),
nom_party = party that nominated the candidate (character),
pol_affil = declared political affiliation of the candidate (character),
mayor = whether the councillor became mayor (logical),
dep_mayor = whether the councillor became deputy mayor (logical),
board = whether the councillor became a member of the executive board (logical),
gov_support = whether the councillor supported the executive body (logical),
elig_voters = * number of eligible voters (numeric),
ballots_cast = * number of ballots cast (numeric),
const_size = * size of the constituency (number of seats) (numeric)
* Variables marked with an asterisk should appear only once per election
and constituency, in the row of any one candidate running in that
election and constituency.
See the Expected input data structure section for how to use the map.
verbose: Logical, default TRUE. If FALSE, suppresses informative messages.
...: Optional arguments reserved for internal development, experimental
features, and future extensions, such as include_cores (logical,
default FALSE). Not yet intended for standard use (behavior may
change without notice). Unknown keys in ... are ignored.
The input data frame (df) must include at least the election identifiers
(year[.month]), candidates' names (uniquely identifying individuals), and
list names. Other variables are optional. If variable names in the dataset
differ from the expected ones, they should be specified in the input_variable_map
as a named list (only differing names need to be listed).
A named list is a list in which each element has a name (the expected
variable name) and a value (the actual name used in your data frame),
for example: list(list_name = "party", elected = "seat", list_votes = "votes_total").
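The mapping described above can be pictured as a simple column rename. The following is a minimal sketch in base R, not the package's internal code: apply_variable_map is a hypothetical helper, and the toy data frame is invented for illustration.

```r
# Hypothetical helper (not part of lpanda): rename columns of `df` so that
# the actual names (values of `map`) become the expected names (names of `map`).
apply_variable_map <- function(df, map) {
  for (expected in names(map)) {
    actual <- map[[expected]]
    # Only rename columns that are really present in the data frame
    if (actual %in% names(df)) {
      names(df)[names(df) == actual] <- expected
    }
  }
  df
}

# Toy input using "party" and "seat" instead of the expected names
df <- data.frame(
  elections = c(2018, 2018),
  candidate = c("John Doe", "Jane Doe"),
  party = c("List A", "List B"),
  seat = c(TRUE, FALSE),
  stringsAsFactors = FALSE
)

renamed <- apply_variable_map(df, list(list_name = "party", elected = "seat"))
names(renamed)
```

Only the differing names need an entry; columns already named as expected (here elections and candidate) pass through untouched.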
Examples of expected and acceptable values in df:
elections (required): Election identifier in the format YY[YY][.MM]:
e.g., 94 | 02 | 1998 | "2024" | 2022.11
candidate (required): e.g., "John Doe" | "John Smith (5)" | "Jane Doe, jr."
list_name (required): name of the candidate list; for independent candidates,
you can use, e.g., "John Smith, Independent Candidate" | "J.S., IND."
list_pos, pref_votes, list_votes: must be numeric
elected, mayor, dep_mayor, board, gov_support: 1 | "0" | T | "F" | "TRUE" | FALSE
(non‑logical inputs will be coerced to logical).
nom_party: for independent candidates, you can use: "IND" | "Independent Candidate"
pol_affil: for independent candidates, you can use: "non-partisan"
elig_voters, ballots_cast, const_size: numeric; each value should appear only
once per election and constituency, in the row of any one candidate.
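To illustrate the kind of coercion the logical variables above may need, here is a minimal base R sketch. It is an assumption about one reasonable approach, not the function's actual implementation; to_logical is a hypothetical helper. A custom mapping is used because base as.logical() returns NA for the strings "1" and "0".

```r
# Hypothetical helper (not part of lpanda): coerce mixed 1/0/T/F/TRUE/FALSE
# inputs to logical. Unrecognised values become NA.
to_logical <- function(x) {
  x <- toupper(trimws(as.character(x)))
  out <- rep(NA, length(x))
  out[x %in% c("1", "T", "TRUE")] <- TRUE
  out[x %in% c("0", "F", "FALSE")] <- FALSE
  out
}

from_strings <- to_logical(c("1", "0", "T", "F", "TRUE", "FALSE"))
from_numbers <- to_logical(c(1, 0))
from_strings
from_numbers
```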
If pref_votes are present but list_votes are not, the function assumes
a voting system where list votes are calculated by summing the preferential
votes of candidates on the list.
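As a rough illustration of that assumption, list votes can be derived by summing preferential votes per list and election. The sketch below uses base R and invented toy data; it is not the package's internal code.

```r
# Toy data: preferential votes per candidate (values invented for illustration)
df <- data.frame(
  elections = c(2018, 2018, 2018, 2022),
  list_name = c("List A", "List A", "List B", "List A"),
  pref_votes = c(120, 80, 60, 150),
  stringsAsFactors = FALSE
)

# Sum preferential votes within each (election, list) pair
list_votes <- aggregate(pref_votes ~ elections + list_name, data = df, FUN = sum)
names(list_votes)[names(list_votes) == "pref_votes"] <- "list_votes"
list_votes
```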
If const_size is missing, it will be estimated based on the number of
elected candidates (if available).
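One plausible way to form such an estimate is to count the elected candidates in each election. The base R sketch below assumes a single constituency per election and uses invented toy data; the function's actual estimation may differ.

```r
# Toy data: one constituency per election (an assumption of this sketch)
df <- data.frame(
  elections = c(2018, 2018, 2018, 2022, 2022),
  candidate = c("Ann", "Bob", "Cid", "Ann", "Dee"),
  elected = c(TRUE, TRUE, FALSE, TRUE, FALSE),
  stringsAsFactors = FALSE
)

# Number of elected candidates per election as a stand-in for const_size
const_size <- tapply(df$elected, df$elections, sum)
const_size
```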
For the purposes of analysis, a new variable list_id (class character)
is added to the internally processed copy of df and carried to the output.
It uniquely identifies each candidate list in a given election (combining
list_name and elections), e.g., "Besti Flokkurinn (2010)", "SNP (2019)",
or "John Smith (5), IND. (2022.11)". This variable serves as a key identifier
in LPANDA for tracking candidate lists across elections and constructing
network relations.
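Judging from the examples given, list_id follows the pattern "list_name (elections)". The base R sketch below reproduces that pattern on toy data; the exact construction inside the function is an assumption.

```r
# Toy data mirroring the identifiers mentioned in the text
df <- data.frame(
  elections = c(2010, 2019, 2022.11),
  list_name = c("Besti Flokkurinn", "SNP", "John Smith (5), IND."),
  stringsAsFactors = FALSE
)

# Combine list name and election identifier into one character key
df$list_id <- paste0(df$list_name, " (", df$elections, ")")
df$list_id
```

Note that a numeric identifier such as 2022.10 would be rendered by R as 2022.1, so storing election identifiers with a month part as character may be the safer choice.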
The returned object is a named list with up to seven network objects:
bipartite: bipartite network (candidates-lists).
candidates: projected candidate–candidate network.
lists: projected list–list network (directed by election order).
continuity: filtered version of the lists network (edges between adjacent elections only).
parties: network of detected party clusters (via community detection applied
to the bipartite network).
(cores): higher-level clusters of parties. Cores are currently experimental
and will not appear in the standard output network data. See Note.
elections: inter-election candidate flow and election-level statistics.
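The relationship between the bipartite network, its projections, and the continuity graph can be sketched with an incidence matrix in base R. This is an illustrative approximation on invented toy data, not the package's algorithm; the column names from, to, and shared are hypothetical.

```r
# Toy bipartite edgelist: which candidate stood on which list
bip <- data.frame(
  candidate = c("Ann", "Bob", "Ann", "Cid", "Bob", "Cid"),
  list_id = c("A (2014)", "A (2014)", "B (2018)", "B (2018)", "C (2022)", "C (2022)"),
  stringsAsFactors = FALSE
)
list_year <- c("A (2014)" = 2014, "B (2018)" = 2018, "C (2022)" = 2022)

# Incidence matrix: rows are candidates, columns are lists
inc <- unclass(table(bip$candidate, bip$list_id))

# One-mode projections
cand_proj <- inc %*% t(inc)  # candidate x candidate: number of shared lists
list_proj <- t(inc) %*% inc  # list x list: number of shared candidates

# List-list edgelist (each unordered pair once)
pairs <- which(list_proj > 0 & upper.tri(list_proj), arr.ind = TRUE)
edges <- data.frame(
  from = rownames(list_proj)[pairs[, "row"]],
  to = colnames(list_proj)[pairs[, "col"]],
  shared = list_proj[pairs],
  stringsAsFactors = FALSE
)

# Continuity: keep only pairs of lists from adjacent elections
years <- sort(unique(list_year))
step <- match(list_year[edges$to], years) - match(list_year[edges$from], years)
continuity <- edges[step == 1, ]
continuity
```

In this toy case the pair "A (2014)" and "C (2022)" shares a candidate but skips the 2018 election, so it appears in the list projection and is dropped from the continuity edges.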
Each object is a list with two components:
edgelist: a data.frame representing network edges
node_attr: a data.frame with attributes for each node
For example, ...$candidates$edgelist contains edges between individuals
who appeared on the same candidate list, and ...$elections$node_attr
includes several election statistics (e.g., number of candidates, distributed
seats, plurality index, voter turnout for each election, etc.).
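Because every component has the same two-part shape, the returned object can be summarised with one loop. The sketch below uses a hand-built stand-in rather than real output of prepare_network_data(); its column names (from, to, node) and values are hypothetical.

```r
# Hand-built stand-in mimicking the documented shape (not real lpanda output)
mock <- list(
  candidates = list(
    edgelist = data.frame(from = "Ann", to = "Bob", stringsAsFactors = FALSE),
    node_attr = data.frame(node = c("Ann", "Bob"), stringsAsFactors = FALSE)
  ),
  elections = list(
    edgelist = data.frame(from = "2018", to = "2022", stringsAsFactors = FALSE),
    node_attr = data.frame(node = c("2018", "2022"), stringsAsFactors = FALSE)
  )
)

# Count edges and nodes for each network component
sizes <- t(sapply(mock, function(net) {
  c(edges = nrow(net$edgelist), nodes = nrow(net$node_attr))
}))
sizes
```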
data(sample_different_varnames, package = "lpanda")
df <- sample_different_varnames
str(df) # different variable names: "party" and "seat"
input_variable_map <- list(list_name = "party", elected = "seat")
netdata <- prepare_network_data(df, input_variable_map, verbose = FALSE)
str(netdata, vec.len = 1)