specmine

The goal of specmine is to provide a set of methods for metabolomics data analysis, including data loading in different formats, pre-processing, metabolite identification, univariate and multivariate data analysis, machine learning, feature selection and pathway analysis. Case studies can be found on the website: http://bio.di.uminho.pt/metabolomicspackage/index.html. This package suggests ‘rcytoscapejs’, a package not in mainstream repositories. If you need to install it, use: devtools::install_github('cytoscape/r-cytoscape.js@v0.0.7').

Installation

You can install the released version of specmine from CRAN with:

install.packages("specmine")

And the development version from GitHub with:

# install.packages("devtools")
devtools::install_github("BioSystemsUM/specmine")
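
Before installing, it can help to check which of the packages named above are already available. The following is a minimal sketch using only base R. It installs nothing and assumes only the two package names mentioned in the text (specmine and the optional rcytoscapejs).

```r
# Packages named in the text above: specmine is on CRAN,
# rcytoscapejs is an optional suggestion hosted on GitHub.
pkgs <- c("specmine", "rcytoscapejs")

# requireNamespace() returns TRUE/FALSE and does not attach the package.
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
print(installed)

if (!installed[["specmine"]]) {
  message('specmine is missing; install it with install.packages("specmine")')
}
if (!installed[["rcytoscapejs"]]) {
  message("rcytoscapejs is missing (optional); it is only needed if you want the features that suggest it")
}
```

The check is read-only, so it is safe to run repeatedly, for example at the top of an analysis script.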

Example

This basic example loads the specmine namespace and attaches it to your search path:

library(specmine)
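
Once the package is attached, base R can confirm the installed version and list what it exports. This is a small sketch, guarded so that it does nothing when specmine is not installed; the variable name exported is illustrative only.

```r
# Guarded so the snippet is a no-op when specmine is not installed.
if (requireNamespace("specmine", quietly = TRUE)) {
  library(specmine)
  print(packageVersion("specmine"))   # installed version
  exported <- ls("package:specmine")  # names of exported objects
  print(head(sort(exported), 10))     # first few, alphabetically
  # Help for any function in the index below can be opened with, e.g.:
  # ?create_dataset
} else {
  exported <- character(0)
}
```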

Monthly Downloads

24

Version

3.1.6

License

GPL (>= 2)

Maintainer

Miguel Rocha

Last Published

September 21st, 2021

Functions in specmine (3.1.6)

MAIT_identify_metabolites

MAIT metabolite identification
apply_by_variable

Apply function to variables
apply_by_sample

Apply function to samples
aggregate_samples

Aggregate samples
background_correction

Background correction
baseline_correction

Baseline correction
apply_by_groups

Apply by groups
apply_by_group

Apply by group
absorbance_to_transmittance

Convert absorbance to transmittance
boxplot_variables

Boxplot of variables
convert_chebi_to_kegg

Convert ChEBI codes to KEGG codes.
convert_from_chemospec

Convert from ChemoSpec
convert_hmdb_to_kegg

Convert HMDB codes to KEGG codes.
correlations_test

Correlations test
convert_keggpathway_2_reactiongraph

Convert KEGGPathway object to graph object.
count_missing_values

Count missing values
aov_all_vars

Analysis of variance
compare_regions_by_sample

Compare regions by sample
clustering

Perform cluster analysis
detect_nmr_peaks_from_dataset

Detection of the peaks in an NMR spectra dataset.
correlation_test

Correlation test of two variables or samples
dataset_from_peaks

Dataset from peaks
data_correction

Data correction
correlations_dataset

Dataset correlations
check_dataset

Check dataset
create_pathway_with_reactions

Creates the pathway, with reactions included in the nodes.
convert_multiple_spcmnm_to_kegg

Convert specmine metabolite codes to KEGG codes.
check_2d_dataset

Check 2D dataset.
count_missing_values_per_sample

Count missing values per sample
flat_pattern_filter

Flat pattern filter
first_derivative

First derivative
cubic_root_transform

Cubic root transformation
convert_to_factor

Convert metadata to factor
boxplot_vars_factor

Boxplot of variables with metadata's variable factors
feature_selection

Perform feature selection
get_OrganismsCodes

Get all organisms in KEGG.
get_MetabolitePath

Returns a KEGGPathway object for the pathway specified in pathcode.
get_x_values_as_num

Get x-axis values as numbers
get_metadata

Get metadata
get_metabolights_study_samples_files

Gets the list of files from an assay of a MetaboLights study and saves it in a CSV file.
get_x_label

Get x-axis label
impute_nas_knn

Impute missing values with KNN
get_metabolights_study_metadata_assay

Download metadata file from an assay of MetaboLights study
impute_nas_linapprox

Impute missing values with linear approximation
get_metabolights_study_files_assay

Download data files from an assay of MetaboLights study
count_missing_values_per_variable

Count missing values per variable
heatmap_correlations

Correlations heatmap
hierarchical_clustering

Perform hierarchical clustering analysis
get_sample_names

Get sample names
create_dataset

Create dataset
create_2d_dataset

Create 2D dataset
dendrogram_plot_col

Plot dendrogram
get_sample_2d_data

Get data
dendrogram_plot

Plot dendrogram
get_data_value

Get data value
get_data_as_df

Get data as data frame
get_metabolights_study

Download MetaboLights study files.
get_metabPaths_org

Get the metabolic pathways present in given organism.
impute_nas_mean

Impute missing values with mean
impute_nas_median

Impute missing values with median
metabolights_studies_list

List the study IDs available in the MetaboLights database.
ksTest_dataset

Kolmogorov-Smirnov tests on dataset
normalize

Normalize data
metadata_as_variables

Metadata as variables
kruskalTest_dataset

Kruskal-Wallis tests on dataset
normalize_samples

Normalize samples
get_cpd_names

Get the names of the compounds that correspond to the given KEGG codes.
peak_detection2d

Detection of the peaks in a 2D NMR spectra dataset.
filter_feature_selection

Perform selection by filter
get_data

Get data
get_data_values

Get data values
plot_kruskaltest

Plot Kruskal-Wallis tests results
peaks_per_sample

Peaks per sample
plot_kstest

Plot Kolmogorov-Smirnov tests results
get_files_list_per_assay

Get list of files per assay for MetaboLights study.
read_spc_nosubhdr

Import for Thermo Galactic's spc file format. These functions import .spc files.
remove_data_variables

Remove data variables
predict_samples

Predict samples
remove_metadata_variables

Remove metadata's variables
pca_biplot

PCA biplot
nmr_identification

NMR metabolite identification
log_transform

Logarithmic transformation.
linregression_onevar

Linear regression on one variable
pca_analysis_dataset

PCA analysis (classical)
multiplot

Multiplot
get_metadata_value

Get metadata value
low_level_fusion

Low level fusion
kmeans_result_df

Show cluster members
kmeans_plot

Plot kmeans clusters
subset_x_values

Subset x-values
get_metadata_var

Get metadata variable
mean_centering

Mean centering
subset_x_values_by_interval

Subset x-values by interval
find_equal_samples

Find equal samples
fold_change_var

Fold change applied on two variables
fold_change

Fold change analysis
pca_pairs_kmeans_plot

PCA k-means pairs plot
pca_scoresplot3D

3D PCA scores plot
pca_pairs_plot

PCA pairs plot
read_data_dx

Read data from (J)DX files
read_data_spc

Read data from SPC files
values_per_sample

Values per sample
get_samples_names_spc

Get sample names from SPC files
pca_scoresplot2D

2D PCA scores plot
get_x_values_as_text

Get x-axis values as text
recursive_feature_elimination

Perform recursive feature elimination
get_samples_names_dx

Get sample names from DX files
variables_as_metadata

Variables as metadata
get_value_label

Get value label
impute_nas_value

Impute missing values with value replacement
indexes_to_xvalue_interval

Get the x-values of a vector of indexes
get_paths_with_cpds_org

Get only the paths of the organism that contain one or more of the given compounds.
get_type

Get type of data
linreg_all_vars

Linear Regression
group_peaks

Group peaks
get_peak_values

Get peak values
pca_plot_3d

3D PCA plot
linreg_pvalue_table

Linear regression p-values table
linreg_rsquared

Linear regression r-squared
linreg_coef_table

Linear regression coefficient table
multifactor_aov_all_vars

Multifactor ANOVA
multiClassSummary

Multi Class Summary
msc_correction

Multiplicative scatter correction
remove_data

Remove data
missingvalues_imputation

Missing values imputation
num_samples

Get number of samples
num_x_values

Get number of x values
pca_biplot3D

3D PCA biplot (interactive)
pca_robust

PCA analysis (robust)
pca_scoresplot3D_rgl

3D PCA scores plot (interactive)
pca_screeplot

PCA scree plot
pca_importance

PCA importance
kmeans_clustering

Perform k-means clustering analysis
merge_data_metadata

Merge data and metadata
is_spectra

Check type of data
plot_2d_spectra

Plot of 2D spectra
pathway_analysis

Creates the requested metabolic pathway. If any of the given compounds is present in the pathway, it is coloured differently.
offset_correction

Offset correction
read_varian_2dspectra_raw

Reads raw 2D spectra (intensity over time) in the Varian format and processes them into ppm spectra.
read_varian_spectra_raw

Reads raw spectra (intensity over time) in the Varian format and processes them into ppm spectra.
peaks_per_samples

Peaks per samples
plot_fold_change

Plot fold change results
plot_anova

Plot ANOVA results
merge_datasets

Merge two datasets
replace_data_value

Replace data value
remove_x_values_by_interval

Remove x-values by interval
set_x_values

Set new x-values
multifactor_aov_varexp_table

Multifactor ANOVA variability explained table
pca_kmeans_plot3D

3D PCA k-means plot (interactive)
multifactor_aov_pvalues_table

Multifactor ANOVA p-values table
pca_kmeans_plot2D

2D PCA k-means plot
plot_peaks

Plot the peaks of an MS or NMR dataset.
plot_ttests

Plot t-tests results
stats_by_variable

Statistics of variables
tTests_dataset

t-Tests on dataset
summary_var_importance

Summary of variables importance
shift_correction

Shift correction
subset_by_samples_and_xvalues

Subset by samples and x-values
plotvar_twofactor

Plot variable distribution on two factors
plot_regression_coefs_pvalues

Plot regression coefficient and p-values
volcano_plot_fc_tt

Volcano plot
x_values_to_indexes

Get x-values indexes
read_dataset_csv

Read dataset from CSV
replace_metadata_value

Replace metadata's value
savitzky_golay

Savitzky-Golay transformation
plot_spectra

Plot spectra
set_x_label

Set x-label
subset_metadata

Subset metadata
set_value_label

Set value label
read_csvs_folder

Read CSVs from folder
remove_samples

Remove samples
sum_2d_dataset

2D Dataset summary
remove_samples_by_na_metadata

Remove samples by NA on metadata
read_Bruker_files

Read Bruker processed spectra.
subset_random_samples

Subset random samples
read_data_csv

Read CSV data
plot_spectra_simple

Plot spectra (simple)
smoothing_interpolation

Smoothing interpolation
read_dataset_dx

Read dataset from (J)DX files
snv_dataset

Standard Normal Variate
remove_samples_by_nas

Remove samples by NAs
read_dataset_spc

Read dataset from SPC files
remove_variables_by_nas

Remove variables by NAs
read_metadata

Read metadata
set_sample_names

Set sample names
set_metadata

Set new metadata
subset_samples

Subset samples
sum_dataset

Dataset summary
remove_peaks_interval

Remove interval of peaks
read_Bruker_files_2d

Read Bruker processed 2D spectra.
read_ms_spectra

Read MS spectra
read_multiple_csvs

Read multiple CSVs
train_models_performance

Train models
transform_data

Transform data
remove_peaks_interval_sample_list

Remove interval of peaks (sample list)
subset_samples_by_metadata_values

Subset samples by metadata values
scaling

Scale dataset
spectra_options

Information on the library of NMR reference spectra in our package.
train_and_predict

Train and predict
stats_by_sample

Statistics of samples
scaling_samples

Scale data matrix
train_classifier

Train classifier
transmittance_to_absorbance

Convert transmittance to absorbance
values_per_peak

Values per peak
xvalue_interval_to_indexes

Get indexes of an interval of x-values
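
Several entries in the index above name standard spectral transformations (absorbance/transmittance conversion, mean centering, standard normal variate). As a rough illustration of what such operations compute in general, and not of how specmine implements them, here is a base R sketch on a toy matrix. The matrix values, the samples-in-rows layout, and the use of fractional (not percent) transmittance are all assumptions made for the example.

```r
# Toy absorbance matrix: 2 samples (rows) x 3 x-values (columns); made-up numbers.
absorbance <- matrix(c(0,   1, 2,
                       0.5, 1, 1.5),
                     nrow = 2, byrow = TRUE)

# Absorbance <-> transmittance as a fraction: T = 10^(-A), A = -log10(T).
transmittance <- 10^(-absorbance)
back_to_absorbance <- -log10(transmittance)

# Mean centering: subtract each column's mean from that column.
centered <- sweep(absorbance, 2, colMeans(absorbance), "-")

# Standard normal variate: centre and scale each row by its own mean and sd.
snv <- t(apply(absorbance, 1, function(x) (x - mean(x)) / sd(x)))
```

For real datasets, the corresponding package functions (for example absorbance_to_transmittance, mean_centering, and snv_dataset) operate on specmine dataset objects; consult their help pages for the actual arguments and data layout.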