peacesciencer: Tools and Data for Quantitative Peace Science
peacesciencer is an R package including various functions and data
sets to allow easier analyses in the field of quantitative peace
science. The goal is to provide an R package that reasonably
approximates what made EUGene so attractive to scholars working in the
field of quantitative peace science in the early 2000s. EUGene shined
because it encouraged replications of conflict models while having the
user also generate data from scratch. Likewise, this R package will
offer tools to approximate what EUGene did within the R environment
(i.e. not requiring Windows for installation).
Installation
You can install this from CRAN, as follows:
install.packages("peacesciencer")
You can install the development version of this package through the
devtools package. The development version of the package invariably
has more goodies, but its newer features may not be as thoroughly
stress-tested.
devtools::install_github("svmiller/peacesciencer")
What’s Included in {peacesciencer}
The package is already well developed and its functionality continues to expand. The current development version has the following functions.
Function | Description |
---|---|
add_archigos() | add_archigos() allows you to add some information about leaders to dyad-year or state-year data. The function leans on an abbreviated version of the data, which also comes in this package. |
add_atop_alliance() | add_atop_alliance() allows you to add Alliance Treaty Obligations and Provisions (ATOP) data to a dyad-year data frame. |
add_capital_distance() | add_capital_distance() allows you to add capital-to-capital distance to a dyad-year or state-year data frame. The capitals are coded in the capitals data frame, along with their latitudes and longitudes. The distance variable that emerges (capdist) is calculated using the “Vincenty” method (i.e. “as the crow flies”) and is expressed in kilometers. |
add_ccode_to_gw() | add_ccode_to_gw() allows you to match, as well as one can, Correlates of War system membership data with Gleditsch-Ward system data. |
add_contiguity() | add_contiguity() allows you to add Correlates of War contiguity data to a dyad-year or state-year data frame. |
add_cow_alliance() | add_cow_alliance() allows you to add Correlates of War alliance data to a dyad-year data frame. |
add_cow_majors() | add_cow_majors() allows you to add Correlates of War major power variables to a dyad-year or state-year data frame. |
add_cow_mids() | add_cow_mids() merges in CoW’s MID data to a dyad-year data frame. The current version of the CoW-MID data is version 5.0. |
add_cow_trade() | add_cow_trade() allows you to add Correlates of War trade data to a dyad-year or state-year data frame. |
add_cow_wars() | add_cow_wars() allows you to add Correlates of War war data (inter-state or intra-state) to your data. |
add_creg_fractionalization() | add_creg_fractionalization() allows you to add information about the fractionalization/polarization of a state’s ethnic and religious groups to your dyad-year or state-year data. |
add_democracy() | add_democracy() allows you to add estimates of democracy to either dyad-year or state-year data. |
add_gml_mids() | add_gml_mids() merges in GML’s MID data to a dyad-year data frame. The current version of the GML MID data is 2.1.1. |
add_gwcode_to_cow() | add_gwcode_to_cow() allows you to match, as well as one can, Gleditsch-Ward system membership data with Correlates of War state system membership data. |
add_igos() | add_igos() allows you to add information from the Correlates of War International Governmental Organizations data to dyad-year or state-year data, matching on Correlates of War system codes. |
add_minimum_distance() | add_minimum_distance() allows you to add the minimum distance (in kilometers) to a dyad-year or state-year data frame. These estimates are recorded in the cow_mindist and gw_mindist data that come with this package. The data are current as of the end of 2015. |
add_nmc() | add_nmc() allows you to add the Correlates of War National Material Capabilities data to dyad-year or state-year data. |
add_peace_years() | add_peace_years() calculates peace years for your ongoing dyadic conflicts. The function works for both the CoW-MID data and the Gibler-Miller-Little (GML) MID data. |
add_rugged_terrain() | add_rugged_terrain() allows you to add information, however crude, about the “ruggedness” of a state’s terrain to your dyad-year or state-year data. |
add_sdp_gdp() | add_sdp_gdp() allows you to add estimated GDP and “surplus” domestic product data from a 2020 analysis published in International Studies Quarterly by Anders, Fariss, and Markowitz. |
add_strategic_rivalries() | add_strategic_rivalries() merges in Thompson and Dreyer’s (2012) strategic rivalry data to a dyad-year data frame. The data are, as of right now, right-bound at 2010. |
add_ucdp_acd() | add_ucdp_acd() allows you to add UCDP Armed Conflict data to a state-year data frame. |
add_ucdp_onsets() | add_ucdp_onsets() allows you to add information about conflict episode onsets from the UCDP data program to state-year data. |
create_dyadyears() | create_dyadyears() allows you to create dyad-year data from either the Correlates of War (CoW) state system membership data or the Gleditsch-Ward (gw) system membership data. The function leans on internal data provided in the package. |
create_statedays() | create_statedays() allows you to create state-day data from either the Correlates of War (CoW) state system membership data or the Gleditsch-Ward (gw) system membership data. The function leans on internal data provided in the package. |
create_stateyears() | create_stateyears() allows you to generate state-year data from either the Correlates of War (CoW) state system membership data or the Gleditsch-Ward (gw) system membership data. The function leans on internal data provided in the package. |
filter_prd() | filter_prd() filters a dyad-year data frame to just those that are “politically relevant.” This is useful for discarding unnecessary (and unwanted) observations that just consume space in memory. |
ps_cite() | ps_cite() allows the user to get citations to scholarship that they should include in their papers that incorporate the functions and data in this package. |
The current development version also includes the following data.
Object Name | Description |
---|---|
archigos | Archigos: A (Subset of a) Dataset on Political Leaders |
atop_alliance | Alliance Treaty Obligations and Provisions (ATOP) Project Data (v. 5.0) |
capitals | A complete list of capitals and capital transitions for Correlates of War state system members |
ccode_democracy | Democracy data for all Correlates of War states |
cow_alliance | Correlates of War directed dyad-year alliance data |
cow_contdir | Correlates of War Direct Contiguity Data (v. 3.2) |
cow_ddy | A directed dyad-year data frame of Correlates of War state system members |
cow_gw_years | Correlates of War and Gleditsch-Ward states, by year |
cow_igo_ndy | Correlates of War Non-Directed Dyad-Year International Governmental Organizations (IGOs) Data |
cow_igo_sy | Correlates of War State-Year International Governmental Organizations (IGOs) Data |
cow_majors | Correlates of War Major Powers Data (1816-2016) |
cow_mid_ddydisps | Directed Dyadic Dispute-Year Data with No Duplicate Dyad-Years (CoW-MID, v. 5.0) |
cow_mid_dirdisps | Directed Dyadic Dispute-Year Data (CoW-MID, v. 5.0) |
cow_mid_disps | Abbreviated CoW-MID Dispute-level Data (v. 5.0) |
cow_mindist | The Minimum Distance Between States in the Correlates of War System, 1946-2015 |
cow_nmc | Correlates of War National Material Capabilities Data |
cow_sdp_gdp | (Surplus and Gross) Domestic Product for Correlates of War States |
cow_states | Correlates of War State System Membership Data (1816-2016) |
cow_trade_ndy | Correlates of War Dyadic Trade Data Set (v. 4.0) |
cow_trade_sy | Correlates of War National Trade Data Set (v. 4.0) |
cow_war_inter | Correlates of War Inter-State War Data (v. 4.0) |
cow_war_intra | Correlates of War Intra-State War Data (v. 4.1) |
creg | Composition of Religious and Ethnic Groups (CREG) Fractionalization/Polarization Estimates |
gml_dirdisp | Directed dispute-year data (Gibler, Miller, and Little, 2016) |
gml_mid_ddydisps | Directed Dyadic Dispute-Year Data with No Duplicate Dyad-Years (GML, v. 2.1.1) |
gw_cow_years | Gleditsch-Ward states and Correlates of War, by year |
gw_ddy | A directed dyad-year data frame of Gleditsch-Ward state system members |
gw_mindist | The Minimum Distance Between States in the Gleditsch-Ward System, 1946-2015 |
gw_sdp_gdp | (Surplus and Gross) Domestic Product for Gleditsch-Ward States |
gw_states | Gleditsch-Ward (Independent States) System Membership Data (1816-2017) |
gwcode_democracy | Democracy data for all Gleditsch-Ward states |
hief | Historical Index of Ethnic Fractionalization data |
maoz_powers | Zeev Maoz’ Regional/Global Power Data |
ps_bib | A ‘BibTeX’ Data Frame of Citations |
rugged | Rugged/Mountainous Terrain Data |
td_rivalries | Thompson and Dreyer’s (2012) Strategic Rivalries, 1494-2010 |
ucdp_acd | UCDP Armed Conflict Data (ACD) (v. 20.1) |
ucdp_onsets | UCDP Onset Data (v. 19.1) |
How to Use {peacesciencer}
{peacesciencer} has a user’s guide that is worth reading.
The workflow is going to look something like this. This is a
“tidy”-friendly approach to a data-generating process in quantitative
peace science.
First, start with one of two processes to create either dyad-year or
state-year data. The dyad-year data are created with the
create_dyadyears() function. It has a few optional parameters with
hidden defaults. The user can specify what kind of state system
(system) data they want to use, either Correlates of War ("cow") or
Gleditsch-Ward ("gw"), whether they want to extend the data to the
most recently concluded calendar year (mry) (i.e. Correlates of War
state system membership data are current as of Dec. 31, 2016 and the
script can extend that to the end of the most recently concluded
calendar year), and whether the user wants directed or non-directed
dyad-year data (directed).
The create_stateyears() function works much the same way, though
“directed” and “non-directed” make no sense in the state-year context.
Both functions default to Correlates of War state system membership
data to the most recently concluded calendar year.
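As a minimal sketch of the parameters described above, the calls below spell out system, mry, and directed explicitly. It assumes {peacesciencer} is installed; the object names cow_ndy and gw_sy are made up for this illustration.

```r
library(peacesciencer)

# Non-directed Correlates of War dyad-years, without extending the state
# system data to the most recently concluded calendar year.
cow_ndy <- create_dyadyears(system = "cow", mry = FALSE, directed = FALSE)

# Gleditsch-Ward state-years; there is no `directed` argument to set here.
gw_sy <- create_stateyears(system = "gw")

# Peek at the first few rows of each.
head(cow_ndy)
head(gw_sy)
```

Leaving the arguments out entirely should return the defaults described above (Correlates of War data through the most recently concluded calendar year).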
Thereafter, the user can specify what additional variables they want added to these dyad-year or state-year data. Do note: the additional functions lean primarily on Correlates of War state code identifiers. Indeed, the bulk of the quantitative peace science data ecosystem is built around the Correlates of War project. The variables the user wants are added in a “pipe”, one function call after another. The user may also want to break up the data-generating process into a few manageable “chunks” (e.g. first generating dyad-year data and saving it to an object, then adding to it piece by piece).
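One way the “chunked” approach might look is sketched here, assuming {peacesciencer} and {magrittr} (for the pipe) are installed. The object names Base, Step1, and Step2 are hypothetical; only the functions come from the table above.

```r
library(magrittr)      # provides the %>% pipe
library(peacesciencer)

# Chunk 1: create the base state-year data once and save it.
Base <- create_stateyears(system = "cow", mry = FALSE)

# Chunk 2: add capabilities data to the saved object.
Step1 <- Base %>% add_nmc()

# Chunk 3: add democracy estimates in a separate step.
Step2 <- Step1 %>% add_democracy()

# Each step should only add columns to what came before.
c(ncol(Base), ncol(Step1), ncol(Step2))
```

Saving intermediate objects like this makes it easier to inspect what each function added before moving on.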
All told, the process will look something like this. Assume you want to
create some data for something analogous to a “dangerous dyads” design
for all non-directed dyad-years. Here’s how you’d do it in
{peacesciencer}, which is going to be lifted from the source R scripts
for the user’s guide. The first part of this code chunk will lean on
core {peacesciencer} functionality whereas the other stuff is some
post-processing and, as a bonus, some modeling.
library(tidyverse) # load this first for most/all things
library(peacesciencer) # the package of interest
library(stevemisc) # a dependency, but also used for standardizing variables for better interpretation
library(tictoc)
tic()
create_dyadyears(directed = FALSE, mry = FALSE) %>%
  filter_prd() %>%
  add_gml_mids(keep = NULL) %>%
  add_peace_years() %>%
  add_nmc() %>%
  add_democracy() %>%
  add_cow_alliance() %>%
  add_sdp_gdp() -> Data
Data %>%
  mutate(landcontig = ifelse(conttype == 1, 1, 0)) %>%
  mutate(cowmajdyad = ifelse(cowmaj1 == 1 | cowmaj2 == 1, 1, 0)) %>%
  # Create estimate of militarization as milper/tpop
  # Then make a weak-link
  mutate(milit1 = milper1/tpop1,
         milit2 = milper2/tpop2,
         minmilit = ifelse(milit1 > milit2,
                           milit2, milit1)) %>%
  # create CINC proportion (lower over higher)
  mutate(cincprop = ifelse(cinc1 > cinc2,
                           cinc2/cinc1, cinc1/cinc2)) %>%
  # create weak-link specification using Quick UDS data
  mutate(mindemest = ifelse(xm_qudsest1 > xm_qudsest2,
                            xm_qudsest2, xm_qudsest1)) %>%
  # Create "weak-link" measure of jointly advanced economies
  mutate(minwbgdppc = ifelse(wbgdppc2011est1 > wbgdppc2011est2,
                             wbgdppc2011est2, wbgdppc2011est1)) -> Data
# r2sd() is in {stevemisc}, a {peacesciencer} dependency.
# This is just for a more readable regression output.
Data %>%
  mutate_at(vars("cincprop", "mindemest", "minwbgdppc", "minmilit"),
            ~r2sd(.)) -> Data
broom::tidy(modDD <- glm(gmlmidonset ~ landcontig + cincprop + cowmajdyad + cow_defense +
                           mindemest + minwbgdppc + minmilit +
                           gmlmidspell + I(gmlmidspell^2) + I(gmlmidspell^3),
                         data = Data,
                         family = binomial(link = "logit")))
#> # A tibble: 11 x 5
#> term estimate std.error statistic p.value
#> <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 (Intercept) -3.04 0.0634 -47.9 0
#> 2 landcontig 1.05 0.0568 18.5 1.26e- 76
#> 3 cincprop 0.446 0.0363 12.3 9.89e- 35
#> 4 cowmajdyad 0.141 0.0575 2.45 1.41e- 2
#> 5 cow_defense -0.0993 0.0576 -1.72 8.50e- 2
#> 6 mindemest -0.492 0.0524 -9.38 6.55e- 21
#> 7 minwbgdppc 0.283 0.0509 5.56 2.77e- 8
#> 8 minmilit 0.261 0.0231 11.3 1.33e- 29
#> 9 gmlmidspell -0.147 0.00507 -29.1 2.51e-186
#> 10 I(gmlmidspell^2) 0.00249 0.000135 18.4 2.05e- 75
#> 11 I(gmlmidspell^3) -0.0000116 0.000000895 -13.0 1.22e- 38
toc()
#> 11.511 sec elapsed
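The “weak-link” and “lower over higher” variables in the chunk above all take the smaller of two dyad members’ values. The sketch below shows the same logic on a toy data frame in base R; the values are made up for illustration, and pmin()/pmax() are offered only as an equivalent way to write the ifelse() calls, not as what the package itself does.

```r
# Toy dyad-year rows; the values are made up for illustration only.
toy <- data.frame(
  cinc1 = c(0.20, 0.05, 0.10),
  cinc2 = c(0.10, 0.15, 0.10),
  xm_qudsest1 = c(1.2, -0.4, 0.3),
  xm_qudsest2 = c(0.7, 0.9, NA)
)

# Weak-link democracy: the lower of the two scores in the dyad,
# written the same way as in the chunk above.
toy$mindemest_ifelse <- ifelse(toy$xm_qudsest1 > toy$xm_qudsest2,
                               toy$xm_qudsest2, toy$xm_qudsest1)

# The same quantity with pmin(); a missing score still yields NA.
toy$mindemest_pmin <- pmin(toy$xm_qudsest1, toy$xm_qudsest2)

# CINC proportion: lower score over higher score, so it never exceeds 1.
toy$cincprop <- pmin(toy$cinc1, toy$cinc2) / pmax(toy$cinc1, toy$cinc2)

toy
```

Either spelling leaves a dyad-year missing when one side's score is missing, which is worth keeping in mind before modeling.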
Here is how you might do a standard civil conflict analysis using Gleditsch-Ward states and UCDP conflict data.
tic()
create_stateyears(system = 'gw') %>%
  filter(year %in% c(1946:2019)) %>%
  add_ucdp_acd(type=c("intrastate"), only_wars = FALSE) %>%
  add_peace_years() %>%
  add_democracy() %>%
  add_creg_fractionalization() %>%
  add_sdp_gdp() %>%
  add_rugged_terrain() -> Data
create_stateyears(system = 'gw') %>%
  filter(year %in% c(1946:2019)) %>%
  add_ucdp_acd(type=c("intrastate"), only_wars = TRUE) %>%
  add_peace_years() %>%
  rename_at(vars(ucdpongoing:ucdpspell), ~paste0("war_", .)) %>%
  left_join(Data, .) -> Data
Data %>%
  arrange(gwcode, year) %>%
  group_by(gwcode) %>%
  mutate_at(vars("xm_qudsest", "wbgdppc2011est",
                 "wbpopest"), list(l1 = ~lag(., 1))) %>%
  rename_at(vars(contains("_l1")),
            ~paste("l1", gsub("_l1", "", .), sep = "_") ) -> Data
modCW <- list()
broom::tidy(modCW$"All UCDP Conflicts" <- glm(ucdponset ~ l1_wbgdppc2011est + l1_wbpopest +
                                                l1_xm_qudsest + I(l1_xm_qudsest^2) +
                                                newlmtnest + ethfrac + relfrac +
                                                ucdpspell + I(ucdpspell^2) + I(ucdpspell^3),
                                              data = subset(Data),
                                              family = binomial(link = "logit")))
#> # A tibble: 11 x 5
#> term estimate std.error statistic p.value
#> <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 (Intercept) -5.10 1.35 -3.77 0.000161
#> 2 l1_wbgdppc2011est -0.285 0.110 -2.59 0.00952
#> 3 l1_wbpopest 0.229 0.0672 3.41 0.000645
#> 4 l1_xm_qudsest 0.257 0.181 1.43 0.154
#> 5 I(l1_xm_qudsest^2) -0.726 0.211 -3.44 0.000574
#> 6 newlmtnest 0.0549 0.0666 0.824 0.410
#> 7 ethfrac 0.442 0.358 1.23 0.217
#> 8 relfrac -0.389 0.402 -0.969 0.333
#> 9 ucdpspell -0.0738 0.0393 -1.88 0.0601
#> 10 I(ucdpspell^2) 0.00443 0.00205 2.16 0.0304
#> 11 I(ucdpspell^3) -0.0000602 0.0000280 -2.15 0.0316
broom::tidy(modCW$"Wars Only" <- glm(war_ucdponset ~ l1_wbgdppc2011est + l1_wbpopest +
                                       l1_xm_qudsest + I(l1_xm_qudsest^2) +
                                       newlmtnest + ethfrac + relfrac +
                                       war_ucdpspell + I(war_ucdpspell^2) + I(war_ucdpspell^3),
                                     data = subset(Data),
                                     family = binomial(link = "logit")))
#> # A tibble: 11 x 5
#> term estimate std.error statistic p.value
#> <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 (Intercept) -6.59 2.08 -3.16 0.00157
#> 2 l1_wbgdppc2011est -0.343 0.172 -1.99 0.0463
#> 3 l1_wbpopest 0.272 0.106 2.56 0.0105
#> 4 l1_xm_qudsest -0.0846 0.270 -0.313 0.754
#> 5 I(l1_xm_qudsest^2) -0.761 0.352 -2.16 0.0307
#> 6 newlmtnest 0.342 0.112 3.05 0.00226
#> 7 ethfrac 0.333 0.554 0.601 0.548
#> 8 relfrac -0.281 0.593 -0.474 0.635
#> 9 war_ucdpspell -0.111 0.0562 -1.98 0.0478
#> 10 I(war_ucdpspell^2) 0.00466 0.00252 1.85 0.0643
#> 11 I(war_ucdpspell^3) -0.0000499 0.0000302 -1.65 0.0982
toc()
#> 3.892 sec elapsed
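The civil conflict chunk above lags three covariates by one year within each state before modeling. The sketch below isolates that step on a toy data frame, assuming {dplyr} version 1.0.0 or later for across(). The values are made up, and across() is shown here as an alternative spelling of the mutate_at()/rename_at() pair, not as the package author's code.

```r
library(dplyr)

# Toy state-year rows; the values are made up for illustration only.
toy <- tibble(
  gwcode = c(2, 2, 2, 20, 20),
  year = c(1946, 1947, 1948, 1946, 1947),
  wbgdppc2011est = c(9.1, 9.2, 9.3, 8.8, 8.9)
)

lagged <- toy %>%
  arrange(gwcode, year) %>%
  # Group by state so a lag never borrows another state's value.
  group_by(gwcode) %>%
  # Create the one-year lag and name it with an "l1_" prefix in one step.
  mutate(across(wbgdppc2011est, ~ lag(.x, 1), .names = "l1_{.col}")) %>%
  ungroup()

lagged
```

The first year observed for each state has no prior value, so its lag is NA and that row drops out of any model that uses the lagged variable.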
Citing What You Do in {peacesciencer}
You can (and should) cite what you do in {peacesciencer}. The package
includes a data frame of a BibTeX file (ps_bib) and a function for
finding and returning BibTeX entries that you can include in your
projects. This is the ps_cite() function. The ps_cite() function
takes a string and does a partial match for relevant keywords (as
KEYWORDS) associated with entries in the ps_bib file. For example,
you can (and should) cite the package itself.
ps_cite("peacesciencer")
#> @Manual{peacesciencer-package,
#> Author = {Steven V. Miller},
#> Title = {{peacesciencer}: A User's Guide for Quantitative Peace Science in R},
#> Year = {2021},
#> Keywords = {peacesciencer, add_capital_distance(), add_ccode_to_gw(), add_gwcode_to_cow(), capitals},
#> Url = {http://svmiller.com/peacesciencer/}
#> }
You can also see which citations are relevant for the data returned by
add_democracy().
ps_cite("add_democracy()")
#> @Unpublished{coppedgeetal2020vdem,
#> Author = {Michael Coppedge and John Gerring and Carl Henrik Knutsen and Staffan I. Lindberg and Jan Teorell and David Altman and Michael Bernhard and M. Steven Fish and Adam Glynn and Allen Hicken and Anna Luhrmann and Kyle L. Marquardt and Kelly McMann and Pamela Paxton and Daniel Pemstein and Brigitte Seim and Rachel Sigman and Svend-Erik Skaaning and Jeffrey Staton and Agnes Cornell and Lisa Gastaldi and Haakon Gjerl{\o}w and Valeriya Mechkova and Johannes von R{\"o}mer and Aksel Sundtr{\"o}m and Eitan Tzelgov and Luca Uberti and Yi-ting Wang and Tore Wig and Daniel Ziblatt},
#> Note = {Varieties of Democracy ({V}-{D}em) Project},
#> Title = {V-Dem Codebook v10},
#> Year = {2020},
#> Keywords = {add_democracy(), v-dem, varieties of democracy}
#> }
#>
#>
#> @Unpublished{marshalletal2017p,
#> Author = {Monty G. Marshall and Ted Robert Gurr and Keith Jaggers},
#> Note = {University of Maryland, Center for International Development and Conflict Management},
#> Title = {Polity {IV} Project: Political Regime Characteristics and Transitions, 1800-2016},
#> Year = {2017},
#> Keywords = {add_democracy(), polity}
#> }
#>
#>
#> @Unpublished{marquez2016qme,
#> Author = {Xavier Marquez},
#> Note = {Available at SSRN: http://ssrn.com/abstract=2753830},
#> Title = {A Quick Method for Extending the {U}nified {D}emocracy {S}cores},
#> Year = {2016},
#> Keywords = {add_democracy(), UDS, Unified Democracy Scores},
#> Url = {http://dx.doi.org/10.2139/ssrn.2753830}
#> }
#>
#>
#> @Article{pemsteinetal2010dc,
#> Author = {Pemstein, Daniel and Stephen A. Meserve and James Melton},
#> Journal = {Political Analysis},
#> Number = {4},
#> Pages = {426--449},
#> Title = {Democratic Compromise: A Latent Variable Analysis of Ten Measures of Regime Type},
#> Volume = {18},
#> Year = {2010},
#> Keywords = {add_democracy(), UDS, Unified Democracy Scores},
#> Owner = {steve},
#> Timestamp = {2011.01.30}
#> }
You can also return partial matches to see what citations are associated with, say, alliance data in this package.
ps_cite("alliance")
#> @Article{leedsetal2002atop,
#> Author = {Bretty Ashley Leeds and Jeffrey M. Ritter and Sara McLaughlin Mitchell and Andrew G. Long},
#> Journal = {International Interactions},
#> Pages = {237--260},
#> Title = {Alliance Treaty Obligations and Provisions, 1815-1944},
#> Volume = {28},
#> Year = {2002},
#> Keywords = {add_atop_alliance()}
#> }
#>
#>
#> @Book{gibler2009ima,
#> Author = {Douglas M. Gibler},
#> Publisher = {Washington DC: CQ Press},
#> Title = {International Military Alliances, 1648-2008},
#> Year = {2009},
#> Keywords = {add_cow_alliance()}
#> }
This function might expand in complexity in future releases, but you can
use it right now for finding appropriate citations. You can also scan the
ps_bib data to see what is in there.
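A minimal sketch of scanning ps_bib directly is given here, assuming {peacesciencer} is installed. KEYWORDS is the column named above; the object name alliance_rows is made up for this illustration.

```r
library(peacesciencer)

# See which fields the data frame of BibTeX entries contains.
names(ps_bib)

# Keep the entries whose keywords mention "alliance". This is roughly the
# partial match that ps_cite("alliance") performs, done by hand with grepl().
alliance_rows <- ps_bib[grepl("alliance", ps_bib$KEYWORDS), ]

# How many entries matched.
nrow(alliance_rows)
```

Unlike ps_cite(), this returns rows of the data frame rather than formatted BibTeX entries, which is handy for browsing but not for pasting into a .bib file.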
Issues/Requests
{peacesciencer} is already more than capable of meeting a wide variety
of needs in the peace science community. Users are free to raise an
issue on the project’s GitHub if some feature is not performing as they
think it should or if there are additions they would like to see.