retention: Compute the retention probability of a csQCA solution

Description

This function computes the retention probability for a csQCA solution, under various perturbation scenarios. It only works with bivalent crisp-set data, containing the binary values 0 or 1.

Usage

retention(data, outcome = "", conditions = "", type = "corruption", dep = TRUE, n.cut = 1, incl.cut = 1, p.pert = 0.5, n.pert = 1)

Arguments

data

A dataset of bivalent crisp-set factors.

outcome

The name of the outcome.

conditions

A string containing the condition variables' names, separated by commas.

type

Character, the type of perturbation to simulate: "corruption" for corrupted values in the conditions, or "deletion" for cases deleted entirely.

dep

Logical; TRUE selects DPA - Dependent Perturbations Assumption, FALSE selects IPA - Independent Perturbations Assumption.

n.cut

The minimum number of cases for a causal combination with a set membership score above 0.5, for an output function value of "0" or "1".

incl.cut

The minimum sufficiency inclusion score for an output function value of "1".

p.pert

Probability of perturbation under the independent perturbations assumption (IPA).

n.pert

Number of perturbations under the dependent perturbations assumption (DPA).

Details

The argument data requires a suitable data set in the form of a data frame, in which every bivalent crisp-set variable takes only the values 0 and 1.

The argument outcome specifies the outcome to be explained, in upper-case notation (e.g. X).

The argument conditions specifies the names of the condition variables. If omitted, all variables in data are used except outcome.

The argument type controls which type of perturbations should be simulated to calculate the retention probability. When type = "corruption", it simulates changes of values in the conditions (values of 0 become 1, and values of 1 become 0). When type = "deletion", it calculates the probability of retaining the same solution if a number of cases are deleted from the original data.

The argument dep is a logical which chooses between two categories of assumptions. If dep = TRUE (the default), it indicates DPA - Dependent Perturbations Assumption, where perturbations depend on each other and are tied to a fixed number of cases, ex-ante (see Thiem, Spoehel and Dusa, 2016). If dep = FALSE, it indicates IPA - Independent Perturbations Assumption, where perturbations are assumed to occur independently of each other.
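
The difference between the two assumptions can be illustrated with a small base R simulation. This is only a sketch: the helpers perturb_ipa() and perturb_dpa() and the toy matrix X are hypothetical and not part of the QCA package, a perturbation is assumed here to mean flipping a single value in the conditions, and the sketch draws random perturbations rather than reproducing how retention() itself calculates the probability.

```r
# Hypothetical toy matrix of three crisp-set conditions for four cases.
X <- matrix(c(1, 0, 1,
              0, 0, 1,
              1, 1, 0,
              0, 1, 0),
            nrow = 4, byrow = TRUE,
            dimnames = list(NULL, c("A", "B", "C")))

# IPA (illustrative): every value flips independently with probability p.pert,
# so the number of flipped values is random.
perturb_ipa <- function(X, p.pert) {
    flip <- matrix(runif(length(X)) < p.pert, nrow = nrow(X))
    abs(X - flip)   # 0 becomes 1 and 1 becomes 0 wherever flip is TRUE
}

# DPA (illustrative): exactly n.pert values are flipped, drawn without
# replacement, so the number of flipped values is fixed in advance.
perturb_dpa <- function(X, n.pert) {
    idx <- sample(length(X), n.pert)
    X[idx] <- 1 - X[idx]
    X
}

set.seed(1)
X_ipa <- perturb_ipa(X, p.pert = 0.25)
X_dpa <- perturb_dpa(X, n.pert = 2)

sum(X_ipa != X)   # varies from draw to draw
sum(X_dpa != X)   # always exactly 2
```

Under the independent scheme the count of changed values differs between draws, while under the dependent scheme it is fixed by n.pert, which is why p.pert belongs to the former and n.pert to the latter.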

The argument n.cut is one of the factors that decide which configurations are coded as logical remainders or not, in conjunction with argument incl.cut. Those configurations that contain fewer than n.cut cases with membership scores above 0.5 are coded as logical remainders (OUT = "?"). If the number of such cases is at least n.cut, configurations with an inclusion score of at least incl.cut are coded positive (OUT = "1"), while configurations with an inclusion score below incl.cut are coded negative (OUT = "0").
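
The coding rule described above can be sketched in base R as follows. The helper code_configurations() and the data frame toy are hypothetical and used only for illustration; this is not the package's own truth table routine, and for crisp-set data the inclusion score is assumed to be the share of cases in a configuration that display the outcome.

```r
# Hypothetical toy data: two conditions (A, B) and an outcome (Y).
toy <- data.frame(
    A = c(1, 1, 1, 0, 0, 0),
    B = c(1, 1, 0, 0, 0, 1),
    Y = c(1, 1, 1, 0, 1, 0)
)

# Illustrative helper, not part of QCA: codes every configuration of the
# conditions as "?", "1" or "0" from n.cut and incl.cut.
code_configurations <- function(data, outcome, n.cut = 1, incl.cut = 1) {
    conditions <- setdiff(names(data), outcome)

    # All 2^k logically possible configurations of the k conditions.
    tt <- expand.grid(rep(list(0:1), length(conditions)))
    names(tt) <- conditions

    # One text key per configuration, e.g. "10" for A = 1, B = 0.
    key_tt   <- do.call(paste, c(tt, sep = ""))
    key_data <- do.call(paste, c(data[conditions], sep = ""))

    # Number of cases observed in each configuration.
    tt$n <- as.vector(table(factor(key_data, levels = key_tt)))

    # Inclusion: share of a configuration's cases showing the outcome
    # (NA when the configuration has no cases).
    tt$incl <- unname(vapply(key_tt, function(k) {
        y <- data[[outcome]][key_data == k]
        if (length(y) == 0) NA_real_ else mean(y)
    }, numeric(1)))

    # Fewer than n.cut cases: remainder; otherwise compare with incl.cut.
    tt$OUT <- ifelse(tt$n < n.cut, "?",
                     ifelse(tt$incl >= incl.cut, "1", "0"))
    tt
}

code_configurations(toy, outcome = "Y", n.cut = 1, incl.cut = 1)
code_configurations(toy, outcome = "Y", n.cut = 2, incl.cut = 1)
```

Raising n.cut from 1 to 2 turns the two configurations observed only once into logical remainders, while the configurations with two cases keep their positive or negative coding.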

The argument p.pert specifies the probability of perturbation under the independent perturbations assumption (dep = FALSE).

The argument n.pert specifies the number of perturbations under the dependent perturbations assumption (dep = TRUE). At least one perturbation is needed for a csQCA solution to change; with zero perturbations the solution necessarily remains the same (retention equal to 100%).

References

Thiem, A.; Spoehel, R.; Dusa, A. (2015) “Replication Package for: Enhancing Sensitivity Diagnostics for Qualitative Comparative Analysis: A Combinatorial Approach”, Harvard Dataverse, V1. DOI: http://dx.doi.org/10.7910/DVN/QE27H9

Thiem, A.; Spoehel, R.; Dusa, A. (2016) “Enhancing Sensitivity Diagnostics for Qualitative Comparative Analysis: A Combinatorial Approach.” Political Analysis vol.24, no.1, pp.104-120.

Examples

if (require("QCA")) {

# the replication data, see Thiem, Spoehel and Dusa (2015)

dat <- data.frame(matrix(c(
    rep(1,25), rep(0,20), rep(c(0,0,1,0,0),3),
    0,0,0,1,0,0,1,0,0,0,0, rep(1,7),0,1),
    nrow = 16, byrow = TRUE, dimnames = list(c(
    "AT","DK","FI","NO","SE","AU","CA","FR",
    "US","DE","NL","CH","JP","NZ","IE","BE"),
    c("P", "U", "C", "S", "W"))
))


# calculate the retention probability, for a 2.5% probability of data corruption
# under the IPA - independent perturbations assumption
retention(dat, outcome = "W", type = "corruption", dep = FALSE,
       p.pert = 0.025, incl.cut = 1)

# the probability that a csQCA solution will change
1 - retention(dat, outcome = "W", type = "corruption", dep = FALSE,
       p.pert = 0.025, incl.cut = 1)

}
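
The examples above cover only corruption under IPA. As a further sketch, the call below combines type = "deletion" with dep = TRUE, using the argument names from the Usage section. The choice of n.pert = 2 is arbitrary and made only for illustration, the replication data are rebuilt so that the block runs on its own, and no output is shown because the result has not been verified here.

```r
# the replication data, rebuilt so that this block is self-contained
dat <- data.frame(matrix(c(
    rep(1,25), rep(0,20), rep(c(0,0,1,0,0),3),
    0,0,0,1,0,0,1,0,0,0,0, rep(1,7),0,1),
    nrow = 16, byrow = TRUE, dimnames = list(c(
    "AT","DK","FI","NO","SE","AU","CA","FR",
    "US","DE","NL","CH","JP","NZ","IE","BE"),
    c("P", "U", "C", "S", "W"))
))

if (requireNamespace("QCA", quietly = TRUE)) {

# the probability of retaining the same solution if 2 cases are deleted
# under the DPA - dependent perturbations assumption
p_del <- QCA::retention(dat, outcome = "W", type = "deletion", dep = TRUE,
       n.pert = 2, incl.cut = 1)
print(p_del)

}
```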
