IntLIM: Integration through LInear Modeling

The IntLIM app is also accessible via a server, so no installation is needed.

Please let us know if additional functionality would be useful (see the contact information below).

IntLIM

Interpreting metabolomics data is challenging, but it can be eased by integrating metabolomics with other ‘omics’ data. The IntLIM package, which includes a user-friendly RShiny web app, integrates multiple types of omics data. Unlike other approaches, IntLIM focuses on how associations between specific analytes are affected by phenotypic features. To this end, we developed a linear modeling approach in which an interaction term captures how the association between two analytes depends on phenotype. The workflow involves the following steps: 1) input analyte level (e.g., expression or abundance) data files, 2) filter data sets by analyte level and imputed values, 3) run the linear model to extract FDR-adjusted interaction p-values, 4) filter results by p-values, interaction coefficient percentile, r-squared value, and Spearman correlation differences, and 5) plot/visualize specific analyte associations.
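As a rough illustration of the modeling idea, the sketch below fits a single linear model with an analyte-by-phenotype interaction term using only base R. It does not call any IntLIM function. The simulated data and the variable names (gene, metabolite, phenotype, groupA, groupB) are assumptions made for this example, not part of the package or its example data set.

```r
# Simulated data (hypothetical; not taken from the IntLIM example data set)
set.seed(1)
n <- 40
phenotype <- factor(rep(c("groupA", "groupB"), each = n / 2))
gene <- rnorm(n)

# The metabolite tracks the gene positively in groupA and negatively in groupB
slope <- ifelse(phenotype == "groupA", 1, -1)
metabolite <- slope * gene + rnorm(n, sd = 0.3)

# Model: metabolite ~ gene + phenotype + gene:phenotype
fit <- lm(metabolite ~ gene * phenotype)
coefs <- summary(fit)$coefficients

# The interaction row tests whether the gene-metabolite slope differs by phenotype
interaction_coef <- coefs["gene:phenotypegroupB", "Estimate"]
interaction_p <- coefs["gene:phenotypegroupB", "Pr(>|t|)"]

# Spearman correlation within each group, and the difference between groups
in_a <- phenotype == "groupA"
cor_a <- cor(gene[in_a], metabolite[in_a], method = "spearman")
cor_b <- cor(gene[!in_a], metabolite[!in_a], method = "spearman")
cor_diff <- abs(cor_a - cor_b)
```

In a real analysis this model would be fit for every analyte pair, and the resulting interaction p-values would then be adjusted for multiple testing before any filtering.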

An example data set is provided within the package; it is a subset of the NCI-60 gene expression and metabolomics data (https://wiki.nci.nih.gov/display/NCIDTPdata/Molecular+Target+Data). The vignette outlines how to run the workflow. More details can be found in our publication "IntLIM: integration using linear models of metabolomics and gene expression data".

Citation

If you use IntLIM, please cite the following work:

Siddiqui JK, Baskin E, Liu M, Cantemir-Stone CZ, Zhang B, Bonneville R, McElroy JP, Coombes KR, Mathé EA. IntLIM: integration using linear models of metabolomics and gene expression data. BMC Bioinformatics. 2018 Mar 5;19(1):81. doi: 10.1186/s12859-018-2085-6.

PMID: 29506475; PMCID: PMC5838881

IntLIM prerequisites

IntLIM is an R package and requires R version 3.5.0 or later.
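One way to confirm the prerequisite before installing is to compare the running R version against 3.5.0. This is a minimal sketch using base R only; the variable names are illustrative.

```r
# Minimum R version stated in the prerequisites
required_version <- "3.5.0"

# getRversion() returns the running R version as a comparable object
version_ok <- getRversion() >= required_version

if (version_ok) {
  message("R ", getRversion(), " meets the IntLIM prerequisite.")
} else {
  message("R ", getRversion(), " is older than ", required_version,
          "; please upgrade before installing IntLIM.")
}
```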

Installation from Github

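To install IntLIM, type the following in the R console:

# Install devtools only if it is not already available
if (!requireNamespace("devtools", quietly = TRUE)) install.packages("devtools")
# Install IntLIM from the ncats/IntLIM GitHub repository
devtools::install_github("ncats/IntLIM")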

Vignette

A detailed vignette can be found here.

Formatted Data and Analysis Codes

Formatted data and code to reproduce the NCI-60 and breast cancer analyses can be obtained from the following GitHub repository:

https://github.com/ncats/IntLIMVignettes

Running IntLIM's user-friendly web app

The package functions can be run directly in the R console.
Alternatively, to launch the web app, type the following in your R console:

library(IntLIM)
runIntLIMApp()

Contact

If you encounter any problems running the software, or find installation problems or bugs, please start an issue on the Issues tab or email Ewy Mathe at Ewy.Mathe@nih.gov or Tara Eicher at Tara.Eicher@nih.gov. We also welcome any comments, including suggestions for improving the package.

Install

install.packages('IntLIM')

Monthly Downloads: 38

Version: 2.0.2

License: GPL-2

Maintainer: Tara Eicher

Last Published: August 22nd, 2022

Functions in IntLIM (2.0.2)

FilterDataFolds

Filter input data by abundance values (analyte data) and number of missing values.
RunCrossValidation

Runs the cross-validation end-to-end using the following steps: 1. Create multiple cross-validation folds from the data. 2. Filter each fold using the filtering criteria applied to the entire dataset. 3. Run IntLIM for all folds. 4. Process the results for all folds.
PlotDistributions

Get some stats after reading in data
PermuteIntLIM

Run permutations of the IntLIM code to search for random cross-omic associations in the dataset
RemovePlusInCovars

RemovePlusInCovars
MarginalEffectsGraphDataframe

Creates a dataframe of the marginal effect of phenotype
PlotFoldOverlapUpSet

Makes an UpSet plot showing the filtered pairs of analytes found in each fold. This plot should only be made for cross-validation data.
OutputData

Output data into individual CSV files. All files will be zipped into one archive.
getstatsOneLM

Function that runs linear models for analyte vs. all analytes of the other type
PlotPCA

PCA plots of data for QC
getQuantileForInteractionCoefficient

Function that gets numeric cutoffs from percentile
PermutationCountSummary

Return the number of significant analytes and the number of permutations in which each analyte is significant. If plot = TRUE, show a box plot of number of significant analytes over permutations, overlaid with the number of significant analytes in the original data.
ProcessResultsContinuous

Retrieve significant pairs (i.e., filter out nonsignificant pairs) based on the value of the analyte:type interaction coefficient from the linear model
PermutationPairSummary

Return the number of significant analytes / pairs per permutation and the number of permutations in which each analyte is significant. If plot = TRUE, show a box plot of number of significant analytes over permutations, overlaid with the number of significant analytes in the original data.
ReadData

Read in CSV file
InteractionCoefficientGraph

Graphs a scatterplot of pairs vs. the interaction coefficient for the pair
ProcessResultsAllFolds

Retrieve significant pairs, based on adjusted p-values, interaction coefficient percentile, and r-squared values. This is a wrapper for ProcessResults.
ShowStats

Get some stats after reading in data
MarginalEffectsGraph

Creates a dataframe of the marginal effect of phenotype
ProcessResults

Retrieve significant pairs, based on adjusted p-values. For each pair that is statistically significant, calculate the correlation within group1 (e.g. cancer) and the correlation within group2 (e.g. non-cancer). Users can then remove pairs with a difference in correlations between groups 1 and 2 less than a user-defined threshold.
RunLM

Function that runs linear models and returns interaction p-values.
multi.which

A version of which() for multidimensional arrays (Mark van der Loo, 16.09.2011)
OutputResults

Output results into a zipped CSV file. Results include gene and metabolite pairs, along with model interaction p-values, and correlations in each group being evaluated.
PValueBoxPlots

Visualize the distribution of unadjusted p-values for all covariates from linear models using a bar chart.
RunIntLim

Run linear models and retrieve relevant statistics
RunIntLimAllFolds

Run linear models for all data folds. This is a wrapper for RunIntLim.
pvalCoefVolcano

Volcano plot (difference in correlations vs. p-values) of all pairs
runIntLIMApp

Run the Shiny app
PlotPair

Scatter plot of pairs (based on user selection)
PlotPairFlat

Scatter plot of pairs (based on user selection). This version does not use highcharter and instead produces a base R plot.
getStatsAllLM

Function that runs Linear Models for all analytes
CreateCrossValFolds

Creates multiple cross-validation folds from the data. Format is a list of IntLIMData training and testing pairs. The "training" slot contains all data except that in the given fold, and the "testing" contains all data in the fold.
FilterData

Filter input data by abundance values and number of missing values.
HistogramPairs

Histogram of analyte pairs depending upon independent or outcome analyte
DistPvalues

Visualize the distribution of unadjusted p-values from linear models
BuildDataAndLines

A helper function for the PlotPair functions (i.e. the highcharter one and the flat, base-R one).
IntLIM-package

IntLIM: Integration of Omics Data Using Linear Modeling
IntLimResults-class

IntLimResults class
DistRSquared

Visualize the distribution of unadjusted p-values from linear models
IntLimData-class

IntLimData class
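The filtering described for ProcessResults and getQuantileForInteractionCoefficient can be pictured with a small base-R sketch. It does not call the IntLIM functions and is not their implementation. The results table, its column names, and the thresholds are all assumptions for illustration, and applying the percentile cutoff to the absolute coefficient is a simplification made here.

```r
# Hypothetical per-pair results (simulated; column names are illustrative only)
set.seed(2)
n_pairs <- 200
results <- data.frame(
  pair = paste0("pair", seq_len(n_pairs)),
  p_value = c(runif(20, 0, 1e-4), runif(n_pairs - 20)),
  interaction_coef = rnorm(n_pairs),
  cor_group1 = runif(n_pairs, -1, 1),
  cor_group2 = runif(n_pairs, -1, 1)
)

# Illustrative thresholds (assumptions, not package defaults)
fdr_cutoff <- 0.05
coef_percentile <- 0.5
cor_diff_cutoff <- 0.5

# 1. FDR-adjust the interaction p-values (Benjamini-Hochberg)
results$fdr <- p.adjust(results$p_value, method = "BH")

# 2. Turn a percentile into a numeric cutoff on the absolute coefficient
coef_cutoff <- unname(quantile(abs(results$interaction_coef),
                               probs = coef_percentile))

# 3. Difference in within-group correlations
results$cor_diff <- abs(results$cor_group1 - results$cor_group2)

# Keep pairs that pass all three filters
keep <- results$fdr <= fdr_cutoff &
  abs(results$interaction_coef) >= coef_cutoff &
  results$cor_diff >= cor_diff_cutoff
filtered <- results[keep, ]
```

Each filter is independent, so the thresholds can be tightened or relaxed one at a time to see how many pairs survive.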