LinearDA: Cross-validated Linear Discriminant Analysis

Description

A simple function to perform cross-validated Linear Discriminant Analysis (LDA).

Usage

LinearDA(Data, classCol, selectedCols, CV = FALSE, cvFraction = 0.8,
  extendedResults = FALSE, SetSeed = TRUE, cvType = "createDataPartition",
  k = 10, foldSep, silent = FALSE, ...)

Arguments

Data

(dataframe) the data frame containing the data

classCol

(numeric) column number that contains the variable to be predicted

selectedCols

(optional) (numeric) all the columns of Data to be used, either as the variable to be predicted or as features

CV

(optional) (logical) perform cross-validation of the training dataset? If TRUE, posterior probabilities are present with the model

cvFraction

(optional) (numeric) Fraction of data to keep for training data

extendedResults

(optional) (logical) Return extended results with model?

SetSeed

(optional) (logical) whether to seed the random number generator. Use SetSeed = TRUE to get consistent results; set it to FALSE only for permutation tests

cvType

(optional) (string) type of cross-validation to perform. If cvType = "createDataPartition", a portion of the data (cvFraction) is used for training. If cvType = "Folds", a k-fold cross-validation is performed. If cvType = "LOSO", a leave-one-subject-out cross-validation is performed

k

(optional) (numeric) the number of folds to use when cvType = "Folds"

foldSep

(numeric) column number used to separate the folds in leave-one-subject-out cross-validation; mandatory when cvType = "LOSO"

silent

(optional) (logical) whether to suppress the messages printed during classification

...

(optional) additional arguments for the function
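
The three cvType settings described above differ only in how rows are assigned to training and test sets. The following base R sketch illustrates that idea on hypothetical toy data; it is not LinearDA's internal code, the names toy, trainIdx, folds and losoFolds are made up for illustration, and it uses plain random sampling rather than whatever partitioning LinearDA applies internally.

```r
# Illustrative sketch of the three splitting schemes; not LinearDA's internal code.
set.seed(111)  # fixed seed so the splits are reproducible (cf. SetSeed = TRUE)

# Hypothetical toy data: 6 subjects x 10 trials, two classes, one feature
toy <- data.frame(
  class   = factor(rep(c(1, 2), times = 30)),
  subject = rep(1:6, each = 10),
  feature = rnorm(60)
)
n <- nrow(toy)

# cvType = "createDataPartition": a single split, cvFraction of rows for training
cvFraction <- 0.8
trainIdx <- sample(n, size = round(cvFraction * n))
testIdx  <- setdiff(seq_len(n), trainIdx)

# cvType = "Folds": every row is assigned to exactly one of k test folds
k <- 10
foldId <- sample(rep(seq_len(k), length.out = n))
folds  <- split(seq_len(n), foldId)

# cvType = "LOSO": one fold per subject; foldSep would point at the subject column
losoFolds <- split(seq_len(n), toy$subject)

length(trainIdx)   # number of training rows in the single split
length(folds)      # number of folds in k-fold cross-validation
length(losoFolds)  # number of folds in leave-one-subject-out
```

In the k-fold and leave-one-subject-out schemes, each fold in turn serves as the test set while the remaining rows are used for training.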

Value

Depends on extendedResults. If extendedResults = FALSE, returns Acc, the accuracy of discrimination. If extendedResults = TRUE, returns Acc, the accuracy of discrimination, and fitLDA, the fitted cross-validated LDA model. If CV = TRUE, posterior probabilities are generated and stored in the model.
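
To give a sense of what posterior probabilities stored with a model can look like, here is a minimal sketch that calls lda() from the MASS package directly on the built-in iris data. That LinearDA behaves the same way is an assumption of this sketch, not something stated on this page; cvFit and the use of iris are illustrative only.

```r
library(MASS)  # provides lda(); used here for illustration only

# With CV = TRUE, MASS::lda() returns leave-one-out predictions
# (class and posterior) rather than a fitted model object
cvFit <- lda(Species ~ ., data = iris, CV = TRUE)

# Posterior probabilities: one row per observation, one column per class
dim(cvFit$posterior)
head(round(cvFit$posterior, 3))

# Accuracy of discrimination from the leave-one-out class predictions
Acc <- mean(cvFit$class == iris$Species)
Acc
```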

Details

The function implements Linear Discriminant Analysis (LDA), a simple algorithm for classification-based analyses. LDA builds a model composed of a number of discriminant functions based on linear combinations of data features that provide the best discrimination between two or more conditions/classes. The aim of the statistical analysis in LDA is thus to combine the data feature scores in such a way that a single new composite variable, the discriminant function, is produced (for details see Fisher, 1936; Rao, 1948).
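
The idea of a discriminant function as a weighted combination of features can be sketched with lda() from the MASS package on a two-class subset of the built-in iris data. This is a hedged illustration under stated assumptions, not LinearDA's implementation: the data, the simple random 80/20 split, and the names dat, fit and pred are all chosen for this example.

```r
library(MASS)  # provides lda(); used here for illustration only

set.seed(111)
# Two-class subset of iris, used as stand-in data
dat <- droplevels(subset(iris, Species != "setosa"))

# Hold out 20% of rows for testing (simple random split, an assumption of this sketch)
trainIdx <- sample(nrow(dat), size = round(0.8 * nrow(dat)))
train <- dat[trainIdx, ]
test  <- dat[-trainIdx, ]

fit <- lda(Species ~ ., data = train)

# Coefficients of the single discriminant function (LD1): one weight per feature
fit$scaling

# Each test observation is reduced to one composite score on LD1
pred <- predict(fit, newdata = test)
head(pred$x)

# Confusion matrix and accuracy of discrimination on the held-out rows
confusion <- table(Actual = test$Species, Predicted = pred$class)
Acc <- sum(diag(confusion)) / sum(confusion)
Acc
```

With two classes there is a single discriminant function, so fit$scaling has one column; with more classes there would be up to one fewer function than classes.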

Examples

# Simple model with a data partition of 80% and no extended results
LDAModel <- LinearDA(Data = KinData, classCol = 1,
                     selectedCols = c(1,2,12,22,32,42,52,62,72,82,92,102,112))
# Output
#       Predicted
# Actual  1  2
#      1 51 32
#      2 40 45
# "The accuracy of discrimination was 0.57"

# Same data partition of 80%, returning extended results (accuracy and fitted model)
LDAModel <- LinearDA(Data = KinData, classCol = 1,
                     selectedCols = c(1,2,12,22,32,42,52,62,72,82,92,102,112),
                     CV = FALSE, cvFraction = 0.8, extendedResults = TRUE)

# For a 10-fold cross-validation without outputting messages
LDAModel <- LinearDA(Data = KinData, classCol = 1,
                     selectedCols = c(1,2,12,22,32,42,52,62,72,82,92,102,112),
                     extendedResults = FALSE, cvType = "Folds", k = 10, silent = TRUE)