CTLscan: Scan for Correlated Trait Locus (CTL)

Description

Scan for Correlated Trait Locus (CTL) in populations

Usage

CTLscan(genotypes, phenotypes, phenocol, nperm=100, 
strategy = c("Exact", "Full", "Pairwise"), conditions = NULL, qtls = NULL, ncores = 1,
parametric = FALSE, verbose = FALSE)

Arguments

genotypes

Matrix of genotypes. (individuals x markers)

phenotypes

Matrix of phenotypes. (individuals x phenotypes)

phenocol

Which phenotype column(s) to analyse. Default: analyse all phenotypes.

nperm

Number of permutations to perform. This parameter is not used when strategy="Exact".

strategy

The permutation strategy to use, one of:

Exact - Uses exact calculations to calculate the likelihood of a difference in correlation: Cor(AA) - Cor(BB), with a Bonferroni correction.
Full - Most powerful analysis method. Compensates for marker and trait correlation structure (Breitling et al.).
Pairwise - Suitable when there are many markers and only a few traits (< 50), as in human GWAS. Compensates only for marker correlation structure.

Note: Exact is the default and the fastest option. It uses a normal distribution to estimate p-values and applies a Bonferroni correction. It has, however, the least power to detect CTLs. The two other strategies (Full and Pairwise) perform permutations to assign significance.
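
The idea behind the Exact strategy (a difference in correlation between genotype groups, a normal approximation for the p-value, and a Bonferroni correction) can be sketched in base R. This is an illustration on simulated data using a Fisher z-transform, which is an assumption made here and not necessarily the calculation CTLscan performs internally. All variable names and the number of tests are hypothetical.

```r
# Illustrative sketch only: NOT the ctl package's internal code.
set.seed(1)
n    <- 200
geno <- rep(c("AA", "BB"), each = n / 2)   # hypothetical marker genotypes
t1   <- rnorm(n)                           # hypothetical trait 1
# Trait 2 is correlated with trait 1 only in the AA group
t2   <- ifelse(geno == "AA", 0.8 * t1, 0) + rnorm(n, sd = 0.5)

# Correlation between the two traits within each genotype group
r_aa <- cor(t1[geno == "AA"], t2[geno == "AA"])
r_bb <- cor(t1[geno == "BB"], t2[geno == "BB"])

# Fisher z-transform: the difference is approximately normal
n_aa <- sum(geno == "AA")
n_bb <- sum(geno == "BB")
z <- (atanh(r_aa) - atanh(r_bb)) / sqrt(1 / (n_aa - 3) + 1 / (n_bb - 3))
p <- 2 * pnorm(-abs(z))                    # two-sided p-value

# Bonferroni correction over a hypothetical number of tests
n_tests <- 100
p_adj   <- min(1, p * n_tests)
```

A large absolute z here indicates that the trait-trait correlation differs between the AA and BB groups at this marker.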

conditions

A vector of experimental conditions applied during the experiment. These conditions are used as covariates in the QTL modeling step.

qtls

Used to provide QTL results, when external QTL results are available.

ncores

Number of CPU cores to use during the analysis.

parametric

If FALSE (the default), use non-parametric testing (Spearman), which is less sensitive to outliers in the phenotype data. If TRUE, use parametric testing (Pearson).
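
The difference in outlier sensitivity can be seen with base R's cor() on simulated data. This sketch does not call CTLscan. It only compares the two correlation measures named above, and the data are made up for illustration.

```r
# Illustrative sketch on simulated data (does not use the ctl package)
set.seed(3)
x <- rnorm(30)
y <- x + rnorm(30, sd = 0.3)

# Add a single extreme outlier that runs against the trend
x_out <- c(x, 10)
y_out <- c(y, -10)

pearson_clean  <- cor(x, y, method = "pearson")
pearson_out    <- cor(x_out, y_out, method = "pearson")
spearman_clean <- cor(x, y, method = "spearman")
spearman_out   <- cor(x_out, y_out, method = "spearman")

# How far one outlier moves each estimate
shift_pearson  <- abs(pearson_clean - pearson_out)
shift_spearman <- abs(spearman_clean - spearman_out)
```

In this setup the rank-based Spearman estimate moves much less than the Pearson estimate when the outlier is added.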

verbose

Be verbose.

Value

CTLobject, a list with at each index (i) a CTLscan object:

$dcor - Matrix of Z scores (strategy="Exact"), or Power/Adjacency Z scores, for each trait at each marker (n.markers x n.phenotypes)
$perms - Vector of maximum scores obtained during permutations (n.perms)
$ctl - Matrix of LOD scores for CTL likelihood of phenotype i (n.markers x n.phenotypes)
$qtl - Vector of LOD scores for QTL likelihood of phenotype i (n.markers)
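
As a rough illustration of how a vector of permutation maxima such as $perms is commonly used, the sketch below derives empirical p-values and a threshold in base R. The values are simulated stand-ins, not output from CTLscan, and this is a generic max-statistic permutation approach assumed here for illustration, not necessarily how the package converts scores internally.

```r
# Hypothetical stand-ins for components of one CTLscan object
set.seed(42)
perms    <- rexp(100)            # plays the role of $perms (one maximum per permutation)
observed <- c(0.5, 2.1, 6.3)     # plays the role of observed scores at three markers

# Empirical p-value: fraction of permutation maxima >= the observed score,
# with the usual +1 correction so that p is never exactly 0
emp_p <- sapply(observed, function(s) (sum(perms >= s) + 1) / (length(perms) + 1))

# Score that only 5% of the permutation maxima exceed
threshold <- quantile(perms, 0.95)
```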

Details

By default the algorithm does not perform QTL mapping; the qtl component of the output is a vector of zero LOD scores. This reduces the computational burden. Use the qtls parameter to provide QTL data. Some computational bottlenecks of the algorithm are:

RAM available to the system with large numbers of markers (100K+) and/or phenotypes (100K+).
Computational time with large sample sizes (5000+) and/or large amounts of phenotype data (100K+).
Very large numbers of genotype markers (1M+).

Some ways of avoiding these problems are: CTL mapping using only a single chromosome at a time and/or selecting smaller subsets of the phenotype data for analysis.
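
A minimal sketch of this chunking, using simulated matrices in base R. The chr vector (chromosome of each marker) and all dimensions are hypothetical. The CTLscan call is left as a comment so that the sketch runs without the package installed.

```r
# Simulated stand-ins: individuals x markers and individuals x phenotypes
set.seed(7)
n_ind <- 20
genotypes  <- matrix(sample(1:2, n_ind * 12, replace = TRUE), nrow = n_ind,
                     dimnames = list(NULL, paste0("m", 1:12)))
phenotypes <- matrix(rnorm(n_ind * 6), nrow = n_ind,
                     dimnames = list(NULL, paste0("p", 1:6)))
chr <- rep(1:3, each = 4)        # hypothetical chromosome of each marker

# Marker column indices grouped per chromosome
by_chr <- split(seq_len(ncol(genotypes)), chr)

# Smaller subset of the phenotype data
pheno_subset <- phenotypes[, 1:3, drop = FALSE]

for (k in names(by_chr)) {
  geno_k <- genotypes[, by_chr[[k]], drop = FALSE]
  # With the ctl package installed, each chunk could then be scanned, e.g.:
  # res_k <- CTLscan(geno_k, pheno_subset, phenocol = 1:3)
  cat("chromosome", k, ":", ncol(geno_k), "markers\n")
}
```

Keeping drop = FALSE ensures each chunk stays a matrix even when a chromosome holds a single marker.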

References

TODO

Examples

# NOT RUN {
  library(ctl)
  data(ath.metabolites)                 # Arabidopsis thaliana data set

  ctlscan <- CTLscan(ath.metab$genotypes, ath.metab$phenotypes, phenocol=1:4)
  ctlscan

  # Genetic regions with significant CTLs found for the first phenotype
  CTLregions(ctlscan, ath.metab$map, phenocol = 1)
  
  # Avoid naming the result 'summary', which would mask base R's summary()
  significant <- CTLsignificant(ctlscan)    # Matrix of Trait, Marker, Trait interactions
  significant                               # Get a list of significant CTLs

  nodes <- ctl.lineplot(ctlscan, ath.metab$map)  # Line plot the phenotypes
  nodes
# }
