

Multivariate Data Analysis Tools

mdatools is an R package for preprocessing, exploring and analysing multivariate data. The package provides methods that are mostly common in Chemometrics. It was created for an introductory PhD course on Chemometrics given at the Section of Chemical Engineering, Aalborg University.

The general idea of the package is to collect the most widespread chemometric methods and give them a similar "user interface" (or rather API). A user who knows how to make a model and visualize results for one method can then easily do the same for the others.
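As a rough illustration of this shared interface, the sketch below fits a PCA model and a PLS model and then calls the same generic functions on both. It assumes mdatools is installed and uses the `people` dataset listed in the function index; the choice of column 4 as the response and of four components is arbitrary and only for demonstration.

```r
library(mdatools)

# 'people' is an example dataset shipped with the package
data(people)

# PCA: build the model, then summarise and plot it
m_pca <- pca(people, ncomp = 4, scale = TRUE)
summary(m_pca)
plot(m_pca)

# PLS: predict one column from the others (column 4 is an arbitrary choice)
x <- people[, -4]
y <- people[, 4, drop = FALSE]
m_pls <- pls(x, y, ncomp = 4, scale = TRUE, cv = 1)  # cv = 1: full cross-validation

# The same generic calls work for the PLS model
summary(m_pls)
plot(m_pls)
```

Both models are created with a call named after the method, and `summary()` and `plot()` dispatch to the method-specific implementations (`summary.pca`, `plot.pls` and so on).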

For more details and examples, read the Bookdown tutorial.

If you want to cite the package, please use the following: Sergey Kucheryavskiy, mdatools – R package for chemometrics, Chemometrics and Intelligent Laboratory Systems, Volume 198, 2020 (DOI: 10.1016/j.chemolab.2020.103937).

What is new

The latest release (0.10.3) is available via GitHub (released 28.03.2020). The last major version (0.10.0) contains many improvements, and most of the code has been refactored. Check the full list of changes carefully, as the refactoring can lead to (very small) incompatibilities. The Bookdown tutorial has also been revised significantly. This version will be released on CRAN at the end of February.

Starting from this release, several badges have been added to the top of this file. The first shows the current build status, which makes using the GitHub-hosted developer version more secure. If the build is passing, the code currently available in the master branch passes all CRAN checks and internal tests. The other badges show the number of downloads from CRAN and GitHub and how much of the code is currently covered by tests.

How to install

The package is available from CRAN via the usual installation procedure. However, because CRAN policy restricts the number of submissions (one every 3-4 months), mostly major releases will be published there (with a 2-3 week delay after the GitHub release, as more thorough testing is needed). You can also download the source package archive and install it using the install.packages command. For example, if the downloaded file is mdatools_0.10.3.tar.gz and it is located in the current working directory, run the following:

install.packages("mdatools_0.10.3.tar.gz")

If you have the devtools package installed, the following commands will install the current developer version from the master branch of the GitHub repository (the devtools package must be loaded first):

library(devtools)
install_github("svkucheryavski/mdatools")

Install: install.packages('mdatools')
Monthly Downloads: 1,342
Version: 0.10.3
License: MIT + file LICENSE
Maintainer: Sergey Kucheryavskiy
Last Published: March 28th, 2020

Functions in mdatools (0.10.3)

as.matrix.plsres

as.matrix method for PLS results
as.matrix.regres

as.matrix method for regression results
capitalize

Capitalize text or vector with text values
as.matrix.ldecomp

as.matrix method for ldecomp object
as.matrix.plsdares

as.matrix method for PLS-DA results
categorize

Categorize PCA results
as.matrix.simcamres

as.matrix method for SIMCAM results
as.matrix.classres

as.matrix method for classification results
as.matrix.regcoeffs

as.matrix method for regression coefficients class
as.matrix.simcares

as.matrix method for SIMCA classification results
classify.plsda

PLS-DA classification
chisq.crit

Calculates critical limits for distance values using Chi-square distribution
classres.getPerformance

Calculation of classification performance parameters
classmodel.processRefValues

Check reference class values and convert them to a factor if necessary
getCalibrationData.simcam

Get calibration data
ddmoments.param

Calculates critical limits for distance values using Data Driven moments approach
confint.regcoeffs

Confidence intervals for regression coefficients
categorize.pca

Categorize PCA results based on orthogonal and score distances.
ddrobust.param

Calculates critical limits for distance values using Data Driven robust approach
getConfidenceEllipse

Compute confidence ellipse for a set of points
categorize.pls

Categorize data rows based on PLS results and critical limits for total distance.
classify.simca

SIMCA classification
crossval

Generate sequence of indices for cross-validation
dd.crit

Calculates critical limits for distance values using Data Driven moments approach
classres

Results of classification
crossval.str

String with description of cross-validation method
crossval.getParams

Define parameters based on 'cv' value
getSelectedComponents

Get selected components
getConfusionMatrix

Confusion matrix for classification results
getConfusionMatrix.classres

Confusion matrix for classification results
crossval.simca

Cross-validation of a SIMCA model
crossval.regmodel

Cross-validation of a regression model
chisq.prob

Calculate probabilities for distance values using Chi-square distribution
getProbabilities.simca

Probabilities of class belonging for PCA/SIMCA results
getSelectivityRatio

Selectivity ratio
ellipse

Create ellipse on the current plot
getConvexHull

Compute coordinates of a closed convex hull for data points
fprintf

Imitation of fprintf() function
getDataLabels

Create a vector with labels for plot series
getCalibrationData

Calibration data
getProbabilities

Get class belonging probability
getVIPScores.pls

VIP scores for PLS model
getProbabilities.pca

Probabilities for residual distances
getRegcoeffs

Get regression coefficients
getCalibrationData.pca

Returns matrix with original calibration data
getSelectivityRatio.pls

Selectivity ratio for PLS model
getLabelsAsIndices

Create labels as column or row indices
getRegcoeffs.regmodel

Regression coefficients for PLS model
hotelling.crit

Calculate critical limits for distance values using Hotelling T2 distribution
getLabelsAsValues

Create labels from data values
getMainTitle

Get main title
ldecomp.getDistances

Compute score and residual distances
getVIPScores

VIP scores
ldecomp.plotResiduals

Residuals distance plot for a set of ldecomp objects
mda.inclcols

Include/unhide the excluded columns
getPlotColors

Define colors for plot series
mda.cbind

A wrapper for cbind() method with proper set of attributes
mda.im2data

Convert image to data matrix
getRes

Return list with valid results
ipls

Variable selection with interval PLS
ldecomp.getQLimits

Compute critical limits for orthogonal distances (Q)
hotelling.prob

Calculate probabilities for distance values and given parameters using Hotelling T2 distribution
ldecomp.getLimitsCoordinates

Compute coordinates of lines or curves with critical limits
mda.exclcols

Exclude/hide columns in a dataset
imshow

Show image data as an image
ipls.forward

Runs the forward iPLS algorithm
mda.purgeRows

Removes excluded (hidden) rows from data
mda.purgeCols

Removes excluded (hidden) columns from data
mda.exclrows

Exclude/hide rows in a dataset
ipls.backward

Runs the backward iPLS algorithm
jm.crit

Calculate critical limits for distance values using Jackson-Mudholkar approach
jm.prob

Calculate probabilities for distance values and given parameters using Jackson-Mudholkar approach
mdaplot

Plotting function for a single set of objects
mdaplot.getYTickLabels

Prepare yticklabels for plot
mdaplot.getYTicks

Prepare yticks for plot
ldecomp.getT2Limits

Compute critical limits for score distances (T2)
mda.data2im

Convert data matrix to an image
ldecomp.getLimParams

Compute parameters for critical limits based on calibration results
mdaplot.areColors

Check color values
ldecomp.getVariances

Compute explained variance
mda.inclrows

Include/unhide the excluded rows
mda.purge

Removes excluded (hidden) rows and columns from data
ldecomp

Class for storing and visualising linear decomposition of dataset (X = TP' + E)
mdaplot.getXAxisLim

Calculate limits for x-axis.
mdaplot.getXTickLabels

Prepare xticklabels for plot
mdaplot.showColorbar

Plot colorbar
mda.rbind

A wrapper for rbind() method with proper set of attributes
pcares

Results of PCA decomposition
mda.df2mat

Convert data frame to a matrix
mdaplot.showLines

Plot lines
mda.setimbg

Remove background pixels from image data
mda.show

Wrapper for show() method
pellets

Image data
mda.subset

A wrapper for subset() method with proper set of attributes
mda.getattr

Get data attributes
mda.getexclind

Get indices of excluded rows or columns
mdaplotg

Plotting function for several plot series
mdaplotg.processParam

Check mdaplotg parameters and replicate them if necessary
mdaplot.getColors

Color values for plot elements
mdaplot.formatValues

Format vector with numeric values
mdaplotg.prepareData

Prepare data for mdaplotg
mdaplotg.getLegend

Create and return vector with legend values
mdatools

Package for Multivariate Data Analysis (Chemometrics)
mda.t

A wrapper for t() method with proper set of attributes
plot.plsres

Overview plot for PLS results
plotCooman

Cooman's plot
plot.simcamres

Model overview plot for SIMCAM results
plot.plsdares

Overview plot for PLS-DA results
plot.simcam

Model overview plot for SIMCAM
pca.cal

PCA model calibration
plotConvexHull

Add convex hull for groups of points on scatter plot
pca.getB

Low-dimensional approximation of data matrix X
pca

Principal Component Analysis
mda.setattr

Set data attributes
plot.classres

Plot function for classification results
plotMisclassified.classmodel

Misclassified ratio plot for classification model
plotDiscriminationPower.simcam

Discrimination power plot for SIMCAM model
plotDiscriminationPower

Discrimination power plot
plotProbabilities

Plot for class belonging probability
plotPredictions.simcamres

Prediction plot for SIMCAM results
plotMisclassified

Misclassification ratio plot
plotRegcoeffs

Regression coefficients plot
mdaplot.getXTicks

Prepare xticks for plot
plotSelection.ipls

iPLS performance plot
plotRegcoeffs.regmodel

Regression coefficient plot for regression model
mdaplot.plotAxes

Create axes plane
plot.regres

Plot method for regression results
plotHist

Statistic histogram
plotDensity

Show plot series as density plot (using hex binning)
plotCumVariance.pca

Cumulative explained variance plot for PCA model
plotHist.randtest

Histogram plot for randomization test results
plot.ipls

Overview plot for iPLS results
plotLoadings

Loadings plot
plot.simca

Model overview plot for SIMCA
mdaplot.prepareColors

Prepare colors based on palette and opacity value
plot.pls

Model overview plot for PLS
mdaplot.getYAxisLim

Calculate limits for y-axis.
plotLoadings.pca

Loadings plot for PCA model
plotBiplot.pca

PCA biplot
plotCumVariance

Variance plot
plotDistDoF

Degrees of freedom plot for both distances
plot.plsda

Model overview plot for PLS-DA
plotConfidenceEllipse

Add confidence ellipse for groups of points on scatter plot
plotCumVariance.ldecomp

Cumulative explained variance plot
plotErrorbars

Show plot series as error bars
mdaplotg.showLegend

Show legend for mdaplotg
plotPredictions

Predictions plot
plotSelectivityRatio

Selectivity ratio plot
plotPerformance

Classification performance plot
plotPerformance.classmodel

Performance plot for classification model
plotVIPScores

VIP scores plot
pca.run

Runs one of the selected PCA methods
people

People data
pinv

Pseudo-inverse matrix
pca.svd

Singular Value Decomposition based PCA algorithm
mdaplotg.getYLim

Compute y-axis limits for mdaplotg
mdaplotg.getXLim

Compute x-axis limits for mdaplotg
mdaplotyy

Create line plot with double y-axis
plotVIPScores.pls

VIP scores plot for PLS model
pca.nipals

NIPALS based PCA algorithm
pca.mvreplace

Replace missing values in data
plotPredictions.classmodel

Predictions plot for classification model
plotRegressionLine

Add regression line for data points
plotResiduals

Residuals plot
plotScores

Scores plot
plotScores.ldecomp

Scores plot
plot.pca

Model overview plot for PCA
plot.pcares

Plot method for PCA results object
plotBars

Show plot series as bars
plotVariance

Variance plot
plotSpecificity.classres

Specificity plot for classification results
plotPredictions.classres

Prediction plot for classification results
plotCorr

Correlation plot
plotModelDistance.simcam

Model distance plot for SIMCAM model
plotHotellingEllipse

Hotelling ellipse
plotBiplot

Biplot
plotCorr.randtest

Correlation plot for randomization test results
plot.randtest

Plot for randomization test results
plotLines

Show plot series as set of lines
plotT2DoF

Degrees of freedom plot for score distance (Nh)
plotWeights.pls

X loadings plot for PLS
plotXCumVariance

X cumulative variance plot
plotCooman.simcam

Cooman's plot for SIMCAM model
plotMisclassified.classres

Misclassified ratio plot for classification results
plotPerformance.classres

Performance plot for classification results
plotExtreme

Shows extreme plot for SIMCA model
plot.regcoeffs

Regression coefficients plot
plotCooman.simcamres

Cooman's plot for SIMCAM results
plotModelDistance

Model distance plot
plotExtreme.pca

Extreme plot
plotPointsShape

Add confidence ellipse or convex hull for group of points
plotXResiduals

X residuals plot
plotVariance.ldecomp

Explained variance plot
plotXYResiduals.pls

Residual XY-distance plot
plotXResiduals.pls

Residual distance plot for decomposition of X data
plotModellingPower

Modelling power plot
plotXYResiduals.plsres

Residual distance plot
plotYCumVariance.pls

Cumulative explained Y variance plot for PLS
plotseries

Create plot series object based on data, plot type and parameters
plotYCumVariance

Y cumulative variance plot
plotXLoadings

X loadings plot
plotXLoadings.pls

X loadings plot for PLS
plotXYScores.plsres

XY scores plot for PLS results
plotPredictions.regmodel

Predictions plot for regression model
plotYCumVariance.plsres

Explained cumulative Y variance plot for PLS results
plotQDoF

Degrees of freedom plot for orthogonal distance (Nq)
plotProbabilities.classres

Plot for class belonging probability
plsdares

PLS-DA results
plotRMSE.regmodel

RMSE plot for regression model
plotRMSE

RMSE plot
plotRMSE.ipls

RMSE development plot
plotPredictions.regres

Predictions plot for regression results
plotPredictions.simcam

Predictions plot for SIMCAM model
plotRMSE.regres

RMSE plot for regression results
pls.cal

PLS model calibration
pls

Partial Least Squares regression
pls.getLimitsCoordinates

Compute coordinates of lines or curves with critical limits
plotResiduals.ldecomp

Residual distance plot
predict.simcam

SIMCA multiple classes predictions
plsres

PLS results
prep.snv

Standard Normal Variate transformation
print.regcoeffs

print method for regression coefficients class
prep.savgol

Savitzky-Golay filter
print.randtest

Print method for randtest object
print.simca

Print method for SIMCA model object
print.simcam

Print method for SIMCAM model object
prep.autoscale

Autoscale values
plotResiduals.regres

Residuals plot for regression results
print.pca

Print method for PCA model object
plotScores.pca

Scores plot for PCA model
plotScatter

Show plot series as set of points
plotSelectivityRatio.pls

Selectivity ratio plot for PLS model
plotSensitivity

Sensitivity plot
plotResiduals.pca

Residuals distance plot for PCA model
plotSpecificity

Specificity plot
print.pcares

Print method for PCA results object
plotVariance.plsres

Explained X variance plot for PLS results
plotWeights

Plot for PLS weights
regcoeffs.getStats

Distribution statistics for regression coefficients
selectCompNum.pca

Select optimal number of components for PCA model
plotSpecificity.classmodel

Specificity plot for classification model
plotXScores.pls

X scores plot for PLS
selectCompNum.pls

Select optimal number of components for PLS model
selratio

Selectivity ratio calculation
regres

Regression results
selectCompNum

Select optimal number of components for a model
setDistanceLimits.pca

Compute and set statistical limits for Q and T2 residual distances.
repmat

Replicate matrix x
setDistanceLimits.pls

Compute and set statistical limits for residual distances.
plotXScores.plsres

X scores plot for PLS results
setDistanceLimits

Set residual distance limits
summary.classres

Summary statistics about classification result object
summary.ipls

Summary for iPLS results
summary.plsres

summary method for PLS results object
plotSelection

Selected intervals plot
plotVariance.pca

Explained variance plot for PCA model
plotVariance.pls

Variance plot for PLS
plotXVariance.pls

Explained X variance plot for PLS
plotXVariance

X variance plot
plotXYScores

XY scores plot
plotXYLoadings

XY loadings plot
plotYVariance.pls

Explained Y variance plot for PLS
plotXVariance.plsres

Explained X variance plot for PLS results
plotSensitivity.classmodel

Sensitivity plot for classification model
splitExcludedData

Split the excluded part of data
plotXResiduals.plsres

X residuals plot for PLS results
plotSensitivity.classres

Sensitivity plot for classification results
plotXCumVariance.pls

Cumulative explained X variance plot for PLS
plotXCumVariance.plsres

Explained cumulative X variance plot for PLS results
plotYVariance.plsres

Explained Y variance plot for PLS results
splitPlotData

Split dataset to x and y values depending on plot type
plotXYScores.pls

XY scores plot for PLS
summary.randtest

Summary method for randtest object
predict.pca

PCA predictions
predict.pls

PLS predictions
plotXYLoadings.pls

XY loadings plot for PLS
plotYResiduals

Y residuals plot
plotXYResiduals

Plot for XY-residuals
plotYResiduals.plsres

Y residuals plot for PLS results
plotXScores

X scores plot
summary.regres

summary method for regression results object
pls.getZLimits

Compute critical limits for orthogonal distances (Q)
pls.run

Runs selected PLS algorithm
predict.plsda

PLS-DA predictions
summary.simca

Summary method for SIMCA model object
plotYVariance

Y variance plot
pls.simpls

SIMPLS algorithm
plotYResiduals.regmodel

Y residuals plot for regression model
preparePlotData

Take a dataset and prepare it for plotting
plsda

Partial Least Squares Discriminant Analysis
print.classres

Print information about classification result object
prep.msc

Multiplicative Scatter Correction transformation
predict.simca

SIMCA predictions
regres.slope

Slope
print.plsdares

Print method for PLS-DA results object
print.plsres

print method for PLS results object
print.simcares

Print method for SIMCA results object
print.simcamres

Print method for SIMCAM results object
regress.addattrs

Add names and attributes to matrix with statistics
print.ipls

Print method for iPLS
print.ldecomp

Print method for linear decomposition
simcares

Results of SIMCA one-class classification
simcam

SIMCA multiclass classification
simca

SIMCA one-class classification
randtest

Randomization test for PLS regression
prep.norm

Normalization
summary.pcares

Summary method for PCA results object
summary.pls

Summary method for PLS model object
simdata

Spectral data of polyaromatic hydrocarbon mixtures
summary.simcares

Summary method for SIMCA results object
regcoeffs

Regression coefficients
vipscores

VIP scores for PLS model
print.pls

Print method for PLS model object
print.plsda

Print method for PLS-DA model object
print.regmodel

Print method for PLS model object
print.regres

print method for regression results object
regres.rmse

RMSE
regres.r2

Determination coefficient
showPredictions

Predictions
regres.bias

Prediction bias
showPredictions.classres

Show predicted class values
summary.ldecomp

Summary statistics for linear decomposition
regres.err

Error of prediction
summary.pca

Summary method for PCA model object
showDistanceLimits

Show residual distance limits
summary.plsda

Summary method for PLS-DA model object
summary.plsdares

Summary method for PLS-DA results object
simcam.getPerformanceStats

Performance statistics for SIMCAM model
showLabels

Show labels on plot
simcamres

Results of SIMCA multiclass classification
summary.regcoeffs

Summary method for regcoeffs object
summary.regmodel

Summary method for regression model object
summary.simcam

Summary method for SIMCAM model object
summary.simcamres

Summary method for SIMCAM results object
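
The ldecomp entry in the list above refers to the linear decomposition X = TP' + E. As a minimal sketch of what that formula means, the base R code below builds scores, loadings and residuals from a singular value decomposition of a small synthetic matrix. It does not use mdatools, and the variable names and the choice of two components are assumptions made only for illustration.

```r
# Small synthetic data matrix (values are arbitrary)
set.seed(1)
X <- matrix(rnorm(20 * 5), nrow = 20, ncol = 5)

# Mean-centre the columns, as is usual before a PCA-type decomposition
Xc <- scale(X, center = TRUE, scale = FALSE)

# Singular value decomposition of the centred data
s <- svd(Xc)

# Keep the first A components
A <- 2
loadings <- s$v[, 1:A, drop = FALSE]   # P, 5 x A
scores <- Xc %*% loadings              # T, 20 x A

# Residuals E are what the A-component approximation leaves unexplained
residuals <- Xc - scores %*% t(loadings)

# Fraction of the total variance captured by the A components
explained <- 1 - sum(residuals^2) / sum(Xc^2)
print(explained)
```

Because the loadings are orthonormal, the residuals are orthogonal to them, so TP' and E split the centred data into a modelled part and an unmodelled part.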