
DSFM (version 1.0.1)

Nutrimouse: Gene, Lipid and Grouping Data

Description

A data frame containing gene expression, lipid measurements, and grouping variables (diet and genotype) for 40 mice from a nutrigenomics study.

Usage

data(Nutrimouse)


Format

A data frame with 40 observations on 143 variables:

  • 120 numeric variables for gene expression

  • 21 numeric variables for lipid measurements

  • 2 categorical variables: diet and genotype
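The layout described above can be checked by counting columns by type. The following is a minimal sketch, not part of the package: the helper `count_by_type` and the data frame `toy` are hypothetical, and `toy` is a simulated stand-in with the documented shape (random values and placeholder level labels, not real measurements). With DSFM installed, `count_by_type(Nutrimouse)` would be the call of interest.

```r
# Hypothetical helper: report row count and the number of numeric and
# factor columns in a data frame.
count_by_type <- function(df) {
  c(rows    = nrow(df),
    numeric = sum(vapply(df, is.numeric, logical(1))),
    factor  = sum(vapply(df, is.factor, logical(1))))
}

# Simulated stand-in with the documented shape (NOT the real data):
# 40 rows, 120 + 21 = 141 numeric columns, and 2 factors.
set.seed(1)
toy <- as.data.frame(matrix(rnorm(40 * 141), nrow = 40))
toy$diet     <- factor(rep(paste0("d", 1:5), each = 8))   # placeholder labels
toy$genotype <- factor(rep(c("g1", "g2"), times = 20))    # placeholder labels

counts <- count_by_type(toy)
print(counts)
```

Selecting columns by type rather than by position avoids depending on the column order, which this page does not specify.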

Details

This dataset was created for integrative analysis of transcriptomic and lipidomic responses of mice to different diets and genotypes.

All numeric variables (genes and lipids) are centered and scaled. The categorical variables indicate the experimental design: five diet types and two genotypes.
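One way to confirm the centering and scaling, and to inspect the experimental design, is sketched below. It runs on a simulated stand-in (`toy`, hypothetical, standardized with `scale()` and balanced by construction) rather than on the real data; the column name `genotype` is assumed from the Format section. Replacing `toy` with `Nutrimouse` should give the corresponding checks for the actual dataset.

```r
# Simulated stand-in (NOT the real data): 141 numeric columns, standardized
# with scale(), plus two placeholder factors.
set.seed(1)
raw <- matrix(rnorm(40 * 141, mean = 5, sd = 2), nrow = 40)
toy <- as.data.frame(scale(raw))          # column means 0, column sds 1
toy$diet     <- factor(rep(paste0("d", 1:5), each = 8))
toy$genotype <- factor(rep(c("g1", "g2"), times = 20))

# Keep only the numeric columns.
num <- toy[, vapply(toy, is.numeric, logical(1))]

# Centered: every column mean should be numerically zero.
max_abs_mean <- max(abs(colMeans(num)))

# Scaled: every column standard deviation should be numerically one.
sd_range <- range(vapply(num, sd, numeric(1)))

# Experimental design: diet levels in rows, genotype levels in columns.
design <- table(toy$diet, toy$genotype)

print(max_abs_mean)
print(sd_range)
print(design)
```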

This format is convenient for regression, classification, and dimension reduction techniques requiring a single data frame.
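As an illustration of why a single data frame is convenient, the sketch below fits a linear model of one numeric variable on both grouping factors through the formula interface. It is an assumption-laden example, not the package's method: `toy` is a simulated stand-in (not the real data), the response is simply the first column, and the column name `genotype` is assumed from the Format section.

```r
# Simulated stand-in (NOT the real data) with the documented shape.
set.seed(1)
toy <- as.data.frame(scale(matrix(rnorm(40 * 141), nrow = 40)))
toy$diet     <- factor(rep(paste0("d", 1:5), each = 8))
toy$genotype <- factor(rep(c("g1", "g2"), times = 20))

# Build the formula <first column> ~ diet + genotype without hard-coding
# the name of the response column.
response <- names(toy)[1]
fml <- reformulate(c("diet", "genotype"), response = response)

# Because response and predictors live in one data frame, a single
# data = argument is enough.
fit <- lm(fml, data = toy)
print(anova(fit))
```

With the real data, the same pattern would apply with `data = Nutrimouse` and any gene or lipid column as the response.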

References

González, I., Déjean, S., Martin, P. G. P., and Baccini, A. (2009). CCA: An R package to extend canonical correlation analysis. Journal of Statistical Software, 23(12), 1–14.

Examples

data(Nutrimouse)

# View structure
str(Nutrimouse)

# Boxplot of the first variable (a gene) across diets
boxplot(Nutrimouse[, 1] ~ Nutrimouse$diet,
        xlab = "Diet", ylab = names(Nutrimouse)[1],
        main = "Gene 1 Expression by Diet")

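# PCA on all numeric variables (excluding factors).
# The variables are already centered and scaled, so scale. = TRUE is
# redundant here but harmless.
nutri_numeric <- Nutrimouse[, sapply(Nutrimouse, is.numeric)]
pca_result <- prcomp(nutri_numeric, scale. = TRUE)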

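# Plot the first two principal components, colored by diet
plot(pca_result$x[, 1:2], col = as.numeric(Nutrimouse$diet), pch = 19)
# Legend colors follow the factor levels, matching as.numeric() above
legend("topright", legend = levels(Nutrimouse$diet),
       col = seq_along(levels(Nutrimouse$diet)), pch = 19)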
