superSubset, findSubsets, findSupersets: Functions to find subsets or supersets

Description

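The functions findSubsets and findSupersets find the row numbers in the implicant matrix that correspond to all subsets, or supersets, of a (prime) implicant or an initial causal configuration. The function superSubset finds implicants that are subsets or supersets of an outcome.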

Usage

superSubset(data, outcome = "", conditions = "", relation = "nec", incl.cut = 1, cov.cut = 0, use.tilde = FALSE, use.letters = FALSE, ...)
findSubsets(noflevels, row.no, maximum)
findSupersets(noflevels, input.combs)

Arguments

data

A data frame with crisp (binary and multi-value) or fuzzy causal conditions

outcome

The name of the outcome.

conditions

A string containing the condition variables' names, separated by commas.

relation

The set relation to outcome, either "nec", "suf", "necsuf" or "sufnec".

incl.cut

The minimal inclusion score of the set relation.

cov.cut

The minimal coverage score of the set relation.

use.tilde

Logical, use tilde for negation with bivalent variables.

use.letters

Logical, use simple letters instead of original variable names.

noflevels

A vector containing the number of levels for each causal condition, plus 1 (all subsets are located in the higher-dimensional implicant matrix).

row.no

The row number(s) where the (prime) implicant(s) are located

maximum

The maximum line number (subset) to be returned

input.combs

A matrix of configurations or a vector of their row numbers in the implicant matrix.

...

Other arguments for backward compatibility.

Value

The result of the superSubset function is an object of class "ss", which is a list with the following components:

incl.cov

A data frame with the parameters of fit.

coms

A data frame with the membership scores of the resulting combinations.

For findSubsets and findSupersets, a vector with the row numbers corresponding to all possible subsets, or supersets, of a (prime) implicant.

Details

The function superSubset finds a list of implicants that satisfy some restrictions referring to the inclusion and coverage with respect to the outcome, under given assumptions of necessity and/or sufficiency.

Ragin (2000) posits that under the necessity relation, instances of the outcome constitute a subset of the instances of the cause(s). Conversely, under the sufficiency relation, instances of the outcome constitute a superset of the instances of the cause(s).
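A minimal base R sketch of these two subset relations, using the standard fuzzy-set inclusion ratios. The vectors X and Y and the helpers incl_nec and incl_suf are hypothetical illustrations, not part of the QCA package.

```r
# Hypothetical fuzzy membership scores for a cause X and an outcome Y
X <- c(0.9, 0.8, 0.7, 0.6, 0.2)
Y <- c(0.8, 0.6, 0.7, 0.3, 0.1)

# Necessity: the outcome is a subset of the cause, so the overlap
# is measured relative to the outcome
incl_nec <- function(x, y) sum(pmin(x, y)) / sum(y)

# Sufficiency: the cause is a subset of the outcome, so the overlap
# is measured relative to the cause
incl_suf <- function(x, y) sum(pmin(x, y)) / sum(x)

incl_nec(X, Y)  # 1, because every Y score is at most the matching X score
incl_suf(X, Y)  # below 1, because X exceeds Y in several cases
```

In this made-up data X is fully consistent as a necessary condition for Y, but only partially consistent as a sufficient one.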

When relation = "nec" the function finds all implicants which are supersets of the outcome, then eliminates the redundant ones and returns the surviving (minimal) supersets, provided they pass the inclusion and coverage thresholds. If none of the surviving supersets pass these thresholds, the function will find unions of causal conditions, instead of set intersections.

When relation = "suf" the function finds all implicants which are subsets of the outcome, and similarly eliminates the redundant ones and returns the surviving (minimal) subsets.

When relation = "necsuf", the relation is interpreted as necessity, and cov.cut is automatically set equal to the inclusion cutoff incl.cut. The same automatic equality is made for relation = "sufnec", when relation is interpreted as sufficiency.
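One way to read this equality, sketched in base R with the standard fuzzy-set ratios: the coverage of a necessity relation uses the same ratio as the inclusion of a sufficiency relation, so applying incl.cut to both amounts to asking for a condition that is both necessary and sufficient. The data and the helper fit_nec are hypothetical and only mimic the thresholds described above.

```r
# Hypothetical crisp data: X is present wherever Y is, plus one extra case
X <- c(1, 1, 0, 1, 0, 0)
Y <- c(1, 1, 0, 0, 0, 0)

# Hypothetical helper, not part of the QCA package
fit_nec <- function(x, y, incl.cut = 1, cov.cut = 0) {
    overlap <- sum(pmin(x, y))
    incl <- overlap / sum(y)   # inclusion of necessity
    cov  <- overlap / sum(x)   # coverage of necessity (same ratio as
                               # the inclusion of sufficiency)
    list(incl = incl, cov = cov, pass = incl >= incl.cut && cov >= cov.cut)
}

fit_nec(X, Y, incl.cut = 0.9)$pass                  # TRUE: necessity only
fit_nec(X, Y, incl.cut = 0.9, cov.cut = 0.9)$pass   # FALSE: coverage is 2/3
```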

The argument outcome specifies the name of the outcome. If the outcome is multi-value, the argument can also specify the level to explain, using curly brackets notation.

Outcomes can be negated using a tilde operator, as in ~X. The logical argument neg.out is deprecated but still accepted for backward compatibility; it has been replaced by a tilde in front of the outcome name, which controls whether the outcome or its negation is to be explained. If the outcome comes from a multivalent variable, negation makes the disjunction of all remaining values the new outcome to be explained. neg.out = TRUE and a tilde ~ in the outcome name do not cancel each other out: either one (or both) signals that the outcome should be negated.
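For a multi-value outcome, the effect of negation can be illustrated in base R with a hypothetical vector v taking the values 0, 1 and 2. This only mimics the logic described above, not the package internals.

```r
# Hypothetical multi-value outcome with three levels
v <- c(0, 1, 2, 1, 0, 2)

# Explaining level 1 of the outcome
Y <- as.numeric(v == 1)

# Negating it: the disjunction of all remaining values (0 or 2)
notY <- as.numeric(v %in% c(0, 2))

# Both routes describe the same set of cases
identical(notY, 1 - Y)  # TRUE
```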

If the argument conditions is not specified, all other columns in data are used.

Along with the standard measures of inclusion and coverage, the function also returns PRI for sufficiency and RoN (relevance of necessity, see Schneider & Wagemann, 2012) for the necessity relation.
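Both measures can be sketched in base R from the formulas as they are commonly given in the literature (see Schneider & Wagemann, 2012). The data and the helpers pri and ron are hypothetical, and the package's own computation may differ in detail.

```r
# Hypothetical fuzzy membership scores for a cause X and an outcome Y
X <- c(0.9, 0.8, 0.7, 0.6, 0.2)
Y <- c(0.8, 0.6, 0.7, 0.3, 0.1)

# PRI for sufficiency: discounts the part of X that is simultaneously
# inside Y and inside the negation of Y
pri <- function(x, y) {
    both   <- sum(pmin(x, y))
    contra <- sum(pmin(x, y, 1 - y))
    (both - contra) / (sum(x) - contra)
}

# RoN for necessity: low values flag conditions that are close to
# being always present, hence trivially necessary
ron <- function(x, y) sum(1 - x) / sum(1 - pmin(x, y))

pri(X, Y)
ron(X, Y)
```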

A subset is a combination (an intersection) of causal conditions, with respect to a larger (super)set, which is another (but more parsimonious) combination of causal conditions.

All subsets of a given set can be found in the so-called “implicant matrix”, which is an $n^k$ space, understood as all possible combinations of values in any combination of bases $n$, with each causal condition having three or more levels (Dusa, 2007, 2010).

For binary causal conditions (values 0 and 1), there are three levels in the implicant matrix:

0 to mark a minimized literal

1 to replace the value of 0 in the original binary condition

2 to replace the value of 1 in the original binary condition
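This recoding, and the row number it implies, can be sketched in base R. The helper implicant_row is hypothetical (not a QCA function), and the assumption that the last condition varies fastest is read off the row numbers shown in the Examples section.

```r
# Hypothetical helper, not part of the QCA package
implicant_row <- function(config, noflevels) {
    # config: character vector with "-" for a minimized condition,
    # otherwise the original value ("0", "1", ...)
    coded <- integer(length(config))              # "-" stays coded as 0
    keep <- config != "-"
    coded[keep] <- as.integer(config[keep]) + 1L  # 0 -> 1, 1 -> 2, ...

    # place values of the mixed-base system, last condition fastest
    levels3 <- noflevels + 1
    mbase <- c(rev(cumprod(rev(levels3)))[-1], 1)

    list(coded = coded, row = sum(coded * mbase) + 1)
}

noflevels <- c(2, 2, 2)
implicant_row(c("-", "-", "0"), noflevels)  # coded 0 0 1, row 2
implicant_row(c("0", "1", "0"), noflevels)  # coded 1 2 1, row 17
prod(noflevels + 1)                         # 27 rows in the implicant matrix
```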

A prime implicant is a superset of an initial combination of causal conditions, and the reverse is also true: the initial combination is a subset of a prime implicant.

Any normal implicant (not prime) is a subset of a prime implicant, and at the same time a superset of some initial causal combinations.

Functions findSubsets and findSupersets find:

- all possible such subsets for a given (prime) implicant, or

- all possible supersets of an implicant or initial causal combination

in the implicant matrix.
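To make the two relations concrete, here is a brute-force base R sketch that enumerates the whole implicant matrix for three binary conditions and filters its rows. It is not the algorithm used by findSubsets or findSupersets, and the helpers brute_subsets and brute_supersets are hypothetical. For this small case it returns the same row numbers as those listed in the Examples section.

```r
noflevels <- c(2, 2, 2)

# Build the full implicant matrix; expand.grid varies its first column
# fastest, so the conditions are generated in reverse order and flipped back
grid <- expand.grid(rev(lapply(noflevels + 1, function(n) seq_len(n) - 1)))
im <- unname(as.matrix(grid[, ncol(grid):1]))
colnames(im) <- c("A", "B", "C")

# Subsets: rows that agree with the implicant on every non-minimized
# condition, excluding the implicant itself
brute_subsets <- function(im, row.no) {
    p <- im[row.no, ]
    keep <- apply(im, 1, function(r) all(r[p != 0] == p[p != 0]))
    setdiff(which(keep), row.no)
}

# Supersets: rows whose non-minimized conditions all agree with the
# input row (the completely minimized first row is left out)
brute_supersets <- function(im, row.no) {
    s <- im[row.no, ]
    keep <- apply(im, 1, function(r) any(r != 0) && all(r[r != 0] == s[r != 0]))
    which(keep)
}

brute_subsets(im, 2)     # 5 8 11 14 17 20 23 26
brute_supersets(im, 14)  # 2 4 5 10 11 13 14
```

Enumerating every row is only practical for a handful of conditions, since the matrix grows as the product of the levels.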

References

Cebotari, V.; Vink, M.P. (2013) “A Configurational Analysis of Ethnic Protest in Europe”. International Journal of Comparative Sociology vol.54, no.4, pp.298-324.

Cebotari, Victor; Vink, Maarten Peter (2015) Replication Data for: A configurational analysis of ethnic protest in Europe, Harvard Dataverse, V2, DOI: http://dx.doi.org/10.7910/DVN/PT2IB9

Dusa, Adrian (2007) Enhancing Quine-McCluskey. COMPASSS: Working Paper 2007-49. URL: http://www.compasss.org/wpseries/Dusa2007b.pdf.

Dusa, Adrian (2010) “A Mathematical Approach to the Boolean Minimization Problem.” Quality & Quantity vol.44, no.1, pp.99-113.

Lipset, S. M. (1959) “Some Social Requisites of Democracy: Economic Development and Political Legitimacy”, American Political Science Review vol.53, pp.69-105.

Schneider, Carsten Q.; Wagemann, Claudius (2012) Set-Theoretic Methods for the Social Sciences: A Guide to Qualitative Comparative Analysis (QCA). Cambridge: Cambridge University Press.

Examples

if (require("QCA")) {

# Lipset binary crisp sets
data(LC)
ssLC <- superSubset(LC, "SURV")

require(venn)
x = list("SURV" = which(LC$SURV == 1),
         "STB" = which(ssLC$coms[, 1] == 1),
         "LIT" = which(ssLC$coms[, 2] == 1))
venn(x, cexil = 0.7)


# Lipset multi-value sets
data(LM)
superSubset(LM, "SURV")


# Cebotari & Vink (2013) fuzzy data
data(CVF)

# all necessary combinations with at least 0.9 inclusion and 0.6 coverage cut-offs
ssCVF <- superSubset(CVF, outcome = "PROTEST", incl.cut = 0.90, cov.cut = 0.6)
ssCVF

# the membership scores for the first minimal combination (GEOCON)
ssCVF$coms$GEOCON

# same restrictions, for the negation of the outcome
superSubset(CVF, outcome = "~PROTEST", incl.cut = 0.90, cov.cut = 0.6)


# to find subsets or supersets, a hypothetical example using
# three binary causal conditions, having two levels each: 0 and 1
noflevels <- c(2, 2, 2)

# second row of the implicant matrix: 0 0 1
# which in the "normal" base is:      - - 0
# the prime implicant being: ~C
(sub <- findSubsets(noflevels + 1, row.no = 2))
#  5  8 11 14 17 20 23 26 


getRow(noflevels + 1, sub)

# implicant matrix   normal values
#      A  B  C    |       A  B  C       
#   5  0  1  1    |    5  -  0  0      bc    
#   8  0  2  1    |    8  -  1  0      Bc
#  11  1  0  1    |   11  0  -  0      ac
#  14  1  1  1    |   14  0  0  0      abc
#  17  1  2  1    |   17  0  1  0      aBc
#  20  2  0  1    |   20  1  -  0      Ac
#  23  2  1  1    |   23  1  0  0      Abc               
#  26  2  2  1    |   26  1  1  0      ABc 


# stopping at maximum row number 20
findSubsets(noflevels + 1, 2, 20)
#  5  8 11 14 17 20


# for supersets
findSupersets(noflevels + 1, 14)
#  2  4  5 10 11 13 14

findSupersets(noflevels + 1, 17)
#  2  7  8 10 11 16 17

# input.combs as a matrix
(input.combs <- getRow(noflevels + 1, c(14, 17)))

# implicant matrix   normal values
#  14  1  1  1    |   14  0  0  0       abc
#  17  1  2  1    |   17  0  1  0       aBc

(sup <- findSupersets(noflevels + 1, input.combs))
#  2  4  5  7  8 10 11 13 14 16 17

getRow(noflevels + 1, sup)

# implicant matrix   normal values
#      A  B  C    |       A  B  C       
#   2  0  0  1    |    2  -  -  0       c      
#   4  0  1  0    |    4  -  0  -       b
#   5  0  1  1    |    5  -  0  0       bc
#   7  0  2  0    |    7  -  1  -       B
#   8  0  2  1    |    8  -  1  0       Bc
#  10  1  0  0    |   10  0  -  -       a  
#  11  1  0  1    |   11  0  -  0       ac                 
#  13  1  1  0    |   13  0  0  -       ab   
#  14  1  1  1    |   14  0  0  0       abc
#  16  1  2  0    |   16  0  1  -       aB
#  17  1  2  1    |   17  0  1  0       aBc
                             
}