exametrika (version 1.6.0)

MutualInformation: Mutual Information

Description

Mutual Information is a measure that represents the degree of interdependence between two items. This function is applicable to both binary and polytomous response data. The measure is calculated using the joint probability distribution of responses between item pairs and their marginal probabilities.

Usage

MutualInformation(U, na = NULL, Z = NULL, w = NULL, base = 2)

# S3 method for default
MutualInformation(U, na = NULL, Z = NULL, w = NULL, base = 2)

# S3 method for binary
MutualInformation(U, na = NULL, Z = NULL, w = NULL, base = 2)

# S3 method for ordinal
MutualInformation(U, na = NULL, Z = NULL, w = NULL, base = 2)

Value

A matrix of mutual information values with exametrika class. Each element (i,j) is the mutual information between items i and j, measured in bits when base = 2 (the default). Higher values indicate stronger interdependence between items.

Arguments

U

Either an object of class "exametrika" or raw data. When raw data is given, it is converted to the exametrika class with the dataFormat function.

na

Values to be treated as missing values.

Z

Missing indicator matrix of type matrix or data.frame. Values of 1 indicate observed responses, while 0 indicates missing data.

w

Item weight vector specifying the relative importance of each item.

base

The base for the logarithm. Default is 2. For polytomous data, you can use "V" to set the base to min(rows, columns), "e" for natural logarithm (base e), or any other number to use that specific base.

Details

For binary data, the following formula is used:

$$ MI_{jk} = p_{00} \log_2 \frac{p_{00}}{(1-p_j)(1-p_k)} + p_{01} \log_2 \frac{p_{01}}{(1-p_j)p_k} + p_{10} \log_2 \frac{p_{10}}{p_j(1-p_k)} + p_{11} \log_2 \frac{p_{11}}{p_jp_k} $$

where \(p_j\) and \(p_k\) are the marginal probabilities of a correct response to items j and k, and:

  • \(p_{00}\) is the joint probability of incorrect responses to both items j and k

  • \(p_{01}\) is the joint probability of incorrect response to item j and correct to item k

  • \(p_{10}\) is the joint probability of correct response to item j and incorrect to item k

  • \(p_{11}\) is the joint probability of correct responses to both items j and k
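The binary formula can be checked by hand in base R. The following is a minimal sketch, not the package's implementation: `binary_mi` is a hypothetical helper name, responses are assumed to be coded 0 (incorrect) and 1 (correct), and respondents with a missing response on either item are simply dropped.

```r
# Hypothetical helper (not part of exametrika): mutual information
# between two binary (0/1) response vectors, following the binary formula.
binary_mi <- function(x, y, base = 2) {
  # Keep only respondents who answered both items
  ok <- !is.na(x) & !is.na(y)
  x <- x[ok]
  y <- y[ok]

  # Joint probabilities p00, p01, p10, p11 as a 2 x 2 table
  joint <- table(factor(x, levels = 0:1), factor(y, levels = 0:1)) / length(x)

  # Products of the marginals: (1 - p_j)(1 - p_k), (1 - p_j) p_k, ...
  expected <- outer(rowSums(joint), colSums(joint))

  # Cells with zero joint probability contribute 0 (limit of p log p)
  nz <- joint > 0
  sum(joint[nz] * log(joint[nz] / expected[nz], base = base))
}

x <- c(0, 0, 1, 1, 0, 1, 1, 0)
y <- c(0, 0, 1, 1, 0, 1, 0, 1)
binary_mi(x, y)  # mutual information between the two items, in bits
binary_mi(x, x)  # an item with itself equals its entropy (1 bit here)
```

Comparing an item with itself is a convenient sanity check: with half of the responses correct, the entropy of the item is exactly 1 bit.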

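For polytomous data, the following formula is used:

$$MI_{jk} = \sum_{a=1}^{C_j}\sum_{b=1}^{C_k}p_{ab}\log \frac{p_{ab}}{p_{a\cdot}p_{\cdot b}}$$

Here \(C_j\) and \(C_k\) are the numbers of response categories of items j and k, \(p_{ab}\) is the joint probability of category a on item j and category b on item k, and \(p_{a\cdot}\) and \(p_{\cdot b}\) are the corresponding marginal probabilities.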

The base of the logarithm can be the number of rows, number of columns, min(rows, columns), base-10 logarithm, natural logarithm (e), etc.
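To illustrate how the choice of base affects the polytomous measure, here is a small base R sketch. It is not the package's code: `poly_mi` is a hypothetical helper, and it assumes that "rows" and "columns" refer to the dimensions of the cross-tabulation of the two items, with `"V"` and `"e"` interpreted as described for the `base` argument.

```r
# Hypothetical helper (not part of exametrika): mutual information from a
# contingency table of two polytomous items, with a selectable log base.
poly_mi <- function(tab, base = 2) {
  joint <- tab / sum(tab)                            # joint probabilities
  expected <- outer(rowSums(joint), colSums(joint))  # products of marginals

  # Resolve the base: "V" -> min(rows, columns), "e" -> natural logarithm
  if (identical(base, "V")) {
    base <- min(dim(tab))
  } else if (identical(base, "e")) {
    base <- exp(1)
  }

  # Empty cells contribute 0 to the sum
  nz <- joint > 0
  sum(joint[nz] * log(joint[nz] / expected[nz], base = base))
}

# Cross-tabulation of a 2-category item (rows) and a 3-category item (columns)
tab <- matrix(c(30,  5, 5,
                 5, 30, 5), nrow = 2, byrow = TRUE)
poly_mi(tab)              # base 2
poly_mi(tab, base = "V")  # base min(2, 3) = 2, same value as above
poly_mi(tab, base = "e")  # natural logarithm

# Two perfectly dependent 3-category items
poly_mi(diag(3), base = "V")  # equals 1
```

Because mutual information cannot exceed the logarithm of min(rows, columns), using that quantity as the base scales the measure to the range 0 to 1, which may make values easier to compare across item pairs with different numbers of categories.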

Examples

# example code
library(exametrika)

# Calculate Mutual Information using sample dataset J15S500
MutualInformation(J15S500)
