BidimentionalPosetRepresentation: Bidimensional representation of multidimensional ordinal binary data generated by a specific reversed pair of lexicographic linear extensions

Description

Given a dataset of $n$ statistical units, scored on $k$ ordinal 0/1-indicators and partially ordered component-wise into the Boolean lattice $B_k=(\{0,1\}^k,\leq_{cmp})$, the function finds the bidimensional data representation generated by a specific reversed pair of lexicographic linear extensions.
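
As an illustration of the component-wise order $\leq_{cmp}$ mentioned above, the following is a minimal base R sketch. The helper `leq_cmp` is a hypothetical name introduced here for illustration only; it is not part of the package.

```r
# Hypothetical helper (not part of the package): TRUE when profile a is
# below or equal to profile b in every component, i.e. a <=_cmp b.
leq_cmp <- function(a, b) all(a <= b)

a <- c(0, 1, 0)
b <- c(1, 1, 0)
d <- c(1, 0, 1)

leq_cmp(a, b)  # TRUE: a is dominated by b component by component
leq_cmp(a, d)  # FALSE
leq_cmp(d, a)  # FALSE: a and d are incomparable in B_3
```

Pairs such as `a` and `d`, where neither profile dominates the other, are what make the order partial rather than total.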

Usage

BidimentionalPosetRepresentation(profile, weights, variablesPriority)

Value

A list of 2 elements named LossVAlue and Representation.

LossVAlue: real number indicating the value of the global error $L(D^{out}|D^{inp}, p)$ corresponding to the representation induced by the chosen variablesPriority.

Representation: a data frame with $m$ rows (one for each observed profile) and 5 variables named profiles, x, y, weights and error.
$profiles: integer vector containing the base-10 representation of the $k$-dimensional Boolean vectors representing observed profiles.
$x: integer vector containing the x-coordinates of points representing observed profiles in the bidimensional representation.
$y: integer vector containing the y-coordinates of points representing observed profiles in the bidimensional representation.
$weights: real vector with the frequencies/weights of each observed profile.
$error: real vector with the values of the approximation errors $L(b|D^{inp}, p)$ associated with each observed profile in the bidimensional representation.
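
To make the base-10 representation of a Boolean profile concrete, here is a small base R sketch. The helper `profile_to_int` is hypothetical, and the bit order (column 1 as the least significant bit, matching what `intToBits` produces) is an assumption; the package may use the opposite convention.

```r
# Hypothetical helper: encode each row of a 0/1 matrix as a base-10 integer,
# ASSUMING column 1 holds the least significant bit.
profile_to_int <- function(profile) {
  k <- ncol(profile)
  as.integer(profile %*% 2^(0:(k - 1)))
}

# All 2^3 profiles, one per row, built as in the Examples section
k <- 3
profiles <- t(sapply(0:(2^k - 1), function(x) as.integer(intToBits(x))[1:k]))

# One integer code per row of profiles
codes <- profile_to_int(profiles)
```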

Arguments

profile: Boolean matrix of dimension $m\times k$ containing the $m\leq n$ distinct observed profiles. Each observed profile is a row of profile and appears only once in the matrix.
weights: real vector of length $m$ with the frequencies/weights of each observed profile. Element of position $j$ in vector weights is the frequency/weight of the profile in row $j$ of profile.
variablesPriority: integer vector of length $k$ containing a permutation $i_1,...,i_k$ of $1,...,k$. This vector specifies the criterion used to build the reversed pair of lexicographic linear extensions that approximate $B_k$. The first linear extension is built by ordering profiles first according to their scores on $V_{i_1}$, then according to their scores on $V_{i_{2}}$ and so on, until $V_{i_{k}}$; the second linear extension is built by ordering profiles first according to their scores on $V_{i_k}$, then according to their scores on $V_{i_{k-1}}$ and so on, until $V_{i_{1}}$.
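
The two orderings described for variablesPriority can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's internal implementation: `lex_rank` is a hypothetical helper, and whether the returned x and y coordinates coincide with these ranks is not stated in this page.

```r
# All profiles of B_3, one per row (columns V1, V2, V3)
k <- 3
profiles <- t(sapply(0:(2^k - 1), function(x) as.integer(intToBits(x))[1:k]))
vp <- c(2, 3, 1)  # example priority: V2 first, then V3, then V1

# Hypothetical helper: position of each row in the lexicographic order
# that compares columns in the sequence given by `priority`.
lex_rank <- function(profile, priority) {
  keys <- lapply(priority, function(j) profile[, j])
  ord <- do.call(order, keys)          # row indices sorted lexicographically
  rank <- integer(nrow(profile))
  rank[ord] <- seq_len(nrow(profile))  # invert the permutation
  rank
}

rank1 <- lex_rank(profiles, vp)       # priority i_1, ..., i_k
rank2 <- lex_rank(profiles, rev(vp))  # priority i_k, ..., i_1
```

Both rankings respect the component-wise order: whenever one profile is dominated by another in every component, it also comes no later in each of the two lexicographic orders.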

Examples


#SIMULATING OBSERVED BINARY DATA
#number of binary variables
k <- 6
#building observed profiles matrix
profiles <- sapply((0:(2^k-1)) ,function(x){ as.integer(intToBits(x))})
profiles <- t(profiles[1:k, ])
#building the vector of observation frequencies
weights <- sample.int(100, nrow(profiles), replace=TRUE)
#choosing (at random) a variable priority
vp <- sample.int(k, k, replace=FALSE)
result <- BidimentionalPosetRepresentation(profiles, weights, vp)
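
When the starting point is a raw table with one row per statistical unit, the profile matrix and the weights vector first have to be derived from it. The following base R sketch shows one possible way to do that; the data and the names `raw`, `obs_profiles` and `obs_weights` are hypothetical.

```r
# Hypothetical raw data: n = 8 units scored on k = 3 binary indicators
raw <- matrix(c(0, 0, 1,
                1, 0, 1,
                0, 0, 1,
                1, 1, 1,
                1, 0, 1,
                0, 0, 1,
                0, 0, 0,
                1, 1, 1), ncol = 3, byrow = TRUE)

# One string key per unit, e.g. "001"
key <- apply(raw, 1, paste, collapse = "")
first <- !duplicated(key)

# m x k matrix in which each observed profile appears once
obs_profiles <- raw[first, , drop = FALSE]
# frequency of each row of obs_profiles, in the same row order
obs_weights <- as.vector(table(key)[key[first]])
```

The objects `obs_profiles` and `obs_weights` then have the shape described for the profile and weights arguments.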
