Learn R Programming

NominalLogisticBiplot (version 0.2)

NominalLogisticBiplot: Nominal Logistic Biplot for polytomous data

Description

Function that calculates the parameters of the Nominal Logistic Biplot according to Hernandez-Sanchez & Vicente-Villardon (2013).

Usage

NominalLogisticBiplot(datanom, sFormula = NULL, numFactors = 2, method = "EM", rotation = "varimax", metfsco = "EAP", nnodos = 10, tol = 1e-04, maxiter = 100, penalization = 0.1, cte = TRUE,initial=1,alfa=1, show = FALSE)

Arguments

datanom
The data set, it can be a matrix with integers or a data frame with factors. All variables have to be nominal.
sFormula
This parameter follows the unifying interface for selecting variables from a data frame for a plot, test or model. The most common formula es of type y ~ x1+x2+x3. It has a default value of NULL if not specified.
numFactors
Number of dimensions of the solution. It should be lower than the number of variables. It has a default value of 2.
method
This parameter can be: "EM", "ACM", "MIRT" or "PCOA". Method to compute the row coordinates.
rotation
Rotation method to used with "MIRT" option in "coordinates". No effect fror other options.
metfsco
Calculation method for the fscores with "MIRT" option in "coordinates". No effect fror other options.
nnodos
Number of nodes for gauss quadrature in the EM algorithm.
tol
Tolerance for the EM algorithm.
maxiter
Maximum number of iterations in the EM algorithm.
penalization
Penalization for the ridge regression for each variable.
cte
Include constant in the logistic regression model. Default is TRUE.
initial
Value to decide the method(1-Correspondence analysis, 2-Mirt) that calculates the initial abilities values for the individuals.
alfa
If initial parameter method is correspondence analysis, this parameter determines the weight for rows and columns.
show
Show intermediate copmputations. Default is TRUE.

Value

An object of class "nominal.logistic.biplot". This has some components:
dataSet
Data set of study with all the information about the name of the levels and names of the variables and individuals
RowCoords
Coordinates for the individuals in the reduced space
VariableModels
Information of the regression resuls for each variable.
NumFactors
Number of dimensions selected for the study
Method
Method for calculating the row positions
Rotation
Type of rotation if we have chosen mirt coordinates
Methodfscores
Method of calculation of the fscores in mirt process
NumNodos
Number of nodes for the gauss quadrature in EM algorith
tol
Cut point to stop the EM-algorith
maxiter
Maximum number of iterations in the EM-algorith
penalization
Value for the correction of the ridge regression
cte
Boolean value to choose if the model for each variable will have independent term
show
Boolean value to indicate if we want to see the results of our analysis

Details

The general algorithm used is essentially an alternated procedure in which parameters for rows and columns are computed in alternated steps repeated until convergence. Parameters for the rows are calculated by expectation (E-step) or by a external procedure (Multiple Correspondence Analysis or Principal Coordinates Analysis) and parameters for the columns are computed by maximization (M-step), i. e., by Nominal Logistic Regression. When the procedure for Row scores is external, only one iteration is performed and the procedure is called "External Nominal Logistic Biplot". Because the aim of the biplot is the representation

There are several options for the computation:

1.- Using the package mirt to obtain the row scores, i. e. using a solution obtained from a latent trait model. The column (item) parameters should be directly used by our biplot procedure but, because of the characteristics of the package that performs a default rotation after parameter estimation, we have to reestimate the item parametes to be coherent to the scores.

2.- Using our implementation of the EM algorithm alternating expected a porteriori scores and Ridge Nominal Logistic Regression for each variable.

3.- Using external coordinates for the rows taken from Multiple Correspondence Analysis or Principal Coordinates Analysis and fitting the response surfaces in just one step.

Equations that define a set of probability response surfaces (one for each category and each variable) are no longer sigmoid as in the binary case (Vicente-Villardon et al. (2006)). This means that the level curves are no longer straight lines and then, prediction of probabilities is not made by projection as in the usual linear biplots. For each variable, define a set of convex polygons that can be interpreted as "prediction" regions in the same way as in Gower & Hand (1996). Each pair of response surfaces defined by intersect in a straight line that, projected onto the space of predictors, is the set of points in which the probability of both categories is the same. Those lines are the candidates to be the edges of the convex polygons defining the prediction regions.

References

Hernandez, J. C. & Vicente-Villardon, J. L. (2013) Logistic Biplots for Nominal Data. Submitted. Preprint available at : https://www.researchgate.net/publication/256428288_Logistic_Biplot_for_Nominal_Data?ev=prf_pub

Vicente-Villardon, J., Galindo, M.P & Blazquez-Zaballos, A. (2006), Logistic biplots,Multiple Correspondence Analysis and related methods pp. 491--509.

Demey, J., Vicente-Villardon, J. L., Galindo, M.P. & Zambrano, A. (2008) Identifying Molecular Markers Associated With Classification Of Genotypes Using External Logistic Biplots. Bioinformatics, 24(24), 2832-2838.

Baker, F.B. (1992): Item Response Theory. Parameter Estimation Techniques. Marcel Dekker. New York.

Gabriel, K. (1971), The biplot graphic display of matrices with application to principal component analysis., Biometrika 58(3), 453--467.

Gabriel, K. R. (1998), Generalised bilinear regression, Biometrika 85(3), 689--700.

Gabriel, K. R. & Zamir, S. (1979), Lower rank approximation of matrices by least squares with any choice of weights, Technometrics 21(4), 489--498.

Gower, J. & Hand, D. (1996), Biplots, Monographs on statistics and applied probability. 54. London: Chapman and Hall., 277 pp.

Chalmers,R,P (2012). mirt: A Multidimensional Item Response Theory Package for the R Environment. Journal of Statistical Software, 48(6), 1-29. URL http://www.jstatsoft.org/v48/i06/.

See Also

NominalLogBiplotEM, afc, PCoA

Examples

Run this code

data(HairColor)
nlbo = NominalLogisticBiplot(HairColor,sFormula=NULL,
numFactors=2,method="EM",penalization=0.2,show=FALSE)
nlbo

#data(PhD_nomCyL)
#cyl = NominalLogisticBiplot(PhD_nomCyL,sFormula=NULL,
#numFactors=2,method="EM",initial = 1,penalization=0.3,show=FALSE)
#summary(nlboPhD)
#plot(nlboPhD,QuitNotPredicted=TRUE,ReestimateInFocusPlane=TRUE,
#      planex = 1,planey = 2,proofMode=TRUE,LabelInd=FALSE,AtLeastR2 = 0.01,
#      xlimi=-1.5,xlimu=1.5,ylimi=-1.5,ylimu=1.5,linesVoronoi = TRUE,SmartLabels = FALSE,
#      PlotInd=TRUE,
#      CexInd = c(0.4),                                                           
#      PchInd = c(1),
#      ColorInd="azure3",
#      PlotVars=TRUE,LabelVar = TRUE,
#      PchVar = c(1,2,3,4,5,6,7,8,9),ColorVar = c("red","black","maroon","blue","green",
#      "chocolate4","coral3","brown","brown2"),
#      ShowResults=TRUE)


Run the code above in your browser using DataLab