
setweaver (version 1.0.0)

probstat: probstat

Description

Computes marginal, conditional, and information-theoretic summaries for a binary outcome `y` against each predictor in `x`. For inference, it either runs Fisher's exact test or fits a generalized linear mixed model (GLMM).

Usage

probstat(y, x, test = "Fisher", ri, nfolds, seed = 10101)

Value

A data frame with one row per evaluated predictor (or pair) and the following columns:

xprob

Marginal probability of \(X=1\).

yprob

Marginal probability of \(Y=1\).

cprob

Conditional probability \(P(Y=1 \mid X=1)\).

cprobx

Conditional probability \(P(X=1 \mid Y=1)\).

cprobi

Inverse conditional probability \(P(Y=1 \mid X=0)\).

cpdif

Difference \(P(Y=1 \mid X=1) - P(Y=1)\).

cpdifper

`cpdif` expressed as a percentage of \(P(Y=1)\).

xent

Entropy of \(X\).

yent

Entropy of \(Y\).

ce

Conditional entropy of \(Y \mid X\).

cedif

Difference between marginal and conditional entropy of \(Y\).

cedifper

Percent difference in entropy.

p

p-value from Fisher's exact test or the GLMM (as applicable).
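
The column definitions above can be checked by hand for a single binary predictor. The sketch below uses made-up toy vectors and base R only; it does not call `probstat()` and is not the package's implementation. The log base (2), the form of the two percent columns, and the sign of `cedif` are assumptions, since this page does not state them.

```r
# Toy data (made up for illustration): binary outcome y, one binary predictor x.
y <- c(1, 1, 1, 0, 0, 1, 0, 0, 1, 0)
x <- c(1, 1, 1, 1, 0, 0, 0, 0, 1, 0)

# Shannon entropy of a Bernoulli(p) variable. Base 2 is an assumption.
bern_entropy <- function(p) {
  q <- c(p, 1 - p)
  q <- q[q > 0]                      # treat 0 * log(0) as 0
  -sum(q * log2(q))
}

xprob  <- mean(x == 1)               # P(X = 1)
yprob  <- mean(y == 1)               # P(Y = 1)
cprob  <- mean(y[x == 1] == 1)       # P(Y = 1 | X = 1)
cprobx <- mean(x[y == 1] == 1)       # P(X = 1 | Y = 1)
cprobi <- mean(y[x == 0] == 1)       # P(Y = 1 | X = 0)
cpdif  <- cprob - yprob              # P(Y = 1 | X = 1) - P(Y = 1)
cpdifper <- 100 * cpdif / yprob      # assumed reading of "percent difference"

xent <- bern_entropy(xprob)          # H(X)
yent <- bern_entropy(yprob)          # H(Y)
# H(Y | X) = P(X = 1) * H(Y | X = 1) + P(X = 0) * H(Y | X = 0)
ce <- xprob * bern_entropy(cprob) + (1 - xprob) * bern_entropy(cprobi)
cedif <- yent - ce                   # assumed sign: marginal minus conditional
cedifper <- 100 * cedif / yent       # assumed reading of "percent difference"

# Fisher's exact test on the 2 x 2 table of x against y (base R, stats package).
p <- fisher.test(table(x, y))$p.value

toy_row <- data.frame(xprob, yprob, cprob, cprobx, cprobi, cpdif, cpdifper,
                      xent, yent, ce, cedif, cedifper, p)
print(toy_row)
```

Under these assumptions `cedif` is the mutual information between `x` and `y`, so it cannot be negative, while `cpdif` is negative whenever `X = 1` lowers the chance of `Y = 1`.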

Arguments

y

A binary outcome vector (logical or numeric coded as 0/1). Length `n`.

x

A data frame of predictors (typically the `expanded.data` element returned by `pairmi()`). Must have `n` rows; each column is treated as a candidate predictor.

test

Character string selecting the inferential method; one of `c("fisher", "glmm")`. Defaults to `"fisher"` if missing.

ri

Optional vector/factor giving the grouping variable for a random intercept in the GLMM. Must be length `n`. Ignored if `test = "fisher"`.

nfolds

Integer; number of folds used for cross-validation.

seed

Integer seed for fold randomization.
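
To illustrate how `nfolds` and `seed` usually interact, the following base-R sketch assigns `n` observations to folds reproducibly. It is a generic pattern, not the code `probstat()` uses internally (which this page does not describe); `n` and `fold_id` are hypothetical names introduced for the example.

```r
n      <- 23       # hypothetical number of observations
nfolds <- 5
seed   <- 10101    # the default shown in Usage

# Build near-equal fold labels 1..nfolds, then shuffle them under a fixed seed.
set.seed(seed)
fold_id <- sample(rep(seq_len(nfolds), length.out = n))
print(table(fold_id))

# Re-seeding with the same value reproduces the same assignment.
set.seed(seed)
fold_id_again <- sample(rep(seq_len(nfolds), length.out = n))
print(identical(fold_id, fold_id_again))
```

Fixing the seed keeps fold membership stable across runs, so any fold-dependent results stay comparable between calls.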

Examples

library(setweaver)
# Expand the predictors with pairmi(), then summarise each one against y
pairmiresult <- pairmi(misimdata[, 2:6])
probstat(misimdata$y, pairmiresult$expanded.data, nfolds = 5)
