CTLscan: Scan for Correlated Trait Locus (CTL)

Description

Scan for Correlated Trait Locus (CTL) in populations

Usage

CTLscan(genotypes, phenotypes, phenocol, nperm=100, nthreads = 1, 
strategy = c("Exact", "Full", "Pairwise"),
parametric = FALSE, adjust=TRUE, qtl = TRUE, verbose = FALSE)

Arguments

genotypes

Matrix of genotypes. (individuals x markers)

phenotypes

Matrix of phenotypes. (individuals x phenotypes)
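
The expected input layout can be sketched with simulated data in base R. This is only an illustration: the 1/2 genotype coding, the dimensions, and all object names are assumptions made for the example, not requirements stated on this page.

```r
set.seed(1)
n_ind  <- 50   # individuals (rows of both matrices)
n_mark <- 10   # markers
n_phe  <- 4    # phenotypes

ind_names <- paste0("ind", seq_len(n_ind))

# Genotype matrix: individuals x markers (assumed coding: 1 = AA, 2 = BB)
genotypes <- matrix(sample(1:2, n_ind * n_mark, replace = TRUE),
                    nrow = n_ind, ncol = n_mark,
                    dimnames = list(ind_names, paste0("m", seq_len(n_mark))))

# Phenotype matrix: individuals x phenotypes
phenotypes <- matrix(rnorm(n_ind * n_phe),
                     nrow = n_ind, ncol = n_phe,
                     dimnames = list(ind_names, paste0("trait", seq_len(n_phe))))

# Both matrices must describe the same individuals, in the same order
stopifnot(nrow(genotypes) == nrow(phenotypes),
          identical(rownames(genotypes), rownames(phenotypes)))
```

Checking that the rows line up before a scan is cheap and catches the most common input mistake, a transposed or reordered matrix.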

phenocol

Which phenotype column(s) to analyse. Default: analyse all phenotypes.

nperm

Number of permutations to perform. This parameter is not used when strategy = "Exact".

nthreads

Number of CPU cores to use during the analysis.

strategy

The strategy used to assign significance, one of:

Exact - Uses exact calculations to calculate the likelihood of a difference in correlation, Cor(AA) - Cor(BB), and applies a Bonferroni correction.
Full - Most powerful analysis method. Compensates for marker and trait correlation structure (Breitling et al.).
Pairwise - Suitable when there are many markers and only a few traits (< 50), as in human GWAS. Compensates only for marker correlation structure.

Note: Exact is the default and the fastest option. It uses a normal distribution to estimate p-values and applies a Bonferroni correction. It has, however, the least power to detect CTLs. The two other strategies (Full and Pairwise) perform permutations to assign significance.
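
The idea behind a difference in correlation between genotype groups can be sketched in base R. The example below uses the textbook Fisher z-transformation test for two independent correlations followed by a Bonferroni adjustment. It is an illustration of the concept under stated assumptions (1/2 genotype coding, simulated traits, a hypothetical number of tests) and is not taken from the package's internal code.

```r
set.seed(42)
n    <- 200
geno <- rep(c(1, 2), each = n / 2)   # assumed coding: 1 = AA, 2 = BB
t1   <- rnorm(n)
# Trait 2 is correlated with trait 1 only in the AA group
t2   <- ifelse(geno == 1, 0.8 * t1, 0) + rnorm(n, sd = 0.5)

# Z score for Cor(AA) - Cor(BB) via Fisher's z-transformation (atanh)
diff_cor_z <- function(x, y, g) {
  aa <- g == 1
  bb <- g == 2
  z_aa <- atanh(cor(x[aa], y[aa]))
  z_bb <- atanh(cor(x[bb], y[bb]))
  se   <- sqrt(1 / (sum(aa) - 3) + 1 / (sum(bb) - 3))
  (z_aa - z_bb) / se
}

z <- diff_cor_z(t1, t2, geno)
p <- 2 * pnorm(-abs(z))              # two-sided p-value from the normal distribution

n_tests <- 100                       # hypothetical number of marker x trait tests
p_bonf  <- p.adjust(p, method = "bonferroni", n = n_tests)
```

A large absolute Z score at a marker means the trait pair is correlated in one genotype group but not the other, which is the pattern a CTL scan looks for.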

parametric

Use parametric testing (Pearson, parametric = TRUE) or non-parametric testing (Spearman, parametric = FALSE). The default is to use non-parametric tests, which are less sensitive to outliers in the phenotype data.
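
The sensitivity to outliers can be illustrated with base R's cor(). The data below are simulated and the variable names are chosen for this example only; a single extreme phenotype value is introduced and both correlation measures are recomputed.

```r
set.seed(7)
x <- rnorm(30)
y <- x + rnorm(30, sd = 0.3)          # strongly correlated with x

# Introduce one extreme outlier at the individual with the largest x
y_out <- y
y_out[which.max(x)] <- -100

pearson_clean  <- cor(x, y,     method = "pearson")
pearson_out    <- cor(x, y_out, method = "pearson")
spearman_clean <- cor(x, y,     method = "spearman")
spearman_out   <- cor(x, y_out, method = "spearman")

# Change in each measure caused by the single outlier
c(pearson  = abs(pearson_clean  - pearson_out),
  spearman = abs(spearman_clean - spearman_out))
```

Because Spearman works on ranks, the outlier can move a single observation by at most n - 1 rank positions, whereas its raw value dominates the Pearson estimate.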

adjust

Adjust p-values for multiple testing (only used when strategy = "Exact").

qtl

Use the internal slow QTL mapping method to map QTLs.

verbose

Be verbose.

Value

CTLobject, a list with at each index (i) a CTLscan object:

$dcor - Matrix of Z scores (strategy = "Exact") or Power/Adjacency Z scores for each trait at each marker (n.markers x n.phenotypes)
$perms - Vector of maximum scores obtained during permutations (n.perms)
$ctl - Matrix of LOD scores for CTL likelihood of phenotype i (n.markers x n.phenotypes)
$qtl - Vector of LOD scores for QTL likelihood of phenotype i (n.markers)
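
A vector of maximum permutation scores such as $perms is typically turned into a significance threshold or an empirical p-value. The sketch below shows that general technique in base R with made-up numbers; the values of perms and observed are hypothetical stand-ins, and this is not necessarily how the package itself converts scores to LOD values.

```r
set.seed(3)
# Stand-in for the maximum score seen in each of 100 permutations (hypothetical)
perms    <- rexp(100)
# Hypothetical observed scores, in increasing order
observed <- c(0.5, 2.0, 8.0)

# Score exceeded by only 5% of the permutation maxima
threshold <- quantile(perms, 0.95)

# Empirical p-value: fraction of permutation maxima at least as large as the
# observed score, with +1 in numerator and denominator to avoid p = 0
emp_p <- sapply(observed, function(s) (sum(perms >= s) + 1) / (length(perms) + 1))

data.frame(observed, emp_p, significant = observed > threshold)
```

Using the maximum score per permutation controls the error rate across all markers and traits at once, which is why one threshold applies to the whole scan.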

Details

By default the algorithm will not do QTL mapping; the qtl component of the output is then a vector of zero LOD scores. This removes some computational burden; please use the have.qtls parameter to provide QTL data. Some computational bottlenecks of the algorithm are:

RAM available to the system, with large numbers of markers (100K+) and/or phenotypes (100K+).
Computational time, with large sample sizes (5000+) and/or large amounts of phenotype data (100K+).
Very large numbers of genotype markers (1M+).

Some ways of avoiding these problems are: CTL mapping using only a single chromosome at a time and/or selecting smaller subsets of the phenotype data for analysis.
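
Splitting the genotype matrix by chromosome can be sketched in base R as follows. The marker_chr vector is a hypothetical marker-to-chromosome assignment invented for this example; in practice it would come from the genetic map. Each resulting sub-matrix could then be passed to a separate scan.

```r
set.seed(11)
n_ind <- 20
genotypes <- matrix(sample(1:2, n_ind * 9, replace = TRUE), nrow = n_ind,
                    dimnames = list(NULL, paste0("m", 1:9)))

# Hypothetical map: the chromosome each of the 9 markers lies on
marker_chr <- c(1, 1, 1, 2, 2, 3, 3, 3, 3)

# One genotype sub-matrix per chromosome; drop = FALSE keeps matrix shape
# even when a chromosome carries a single marker
by_chr <- lapply(split(seq_len(ncol(genotypes)), marker_chr),
                 function(idx) genotypes[, idx, drop = FALSE])

# Number of markers per chromosome
sapply(by_chr, ncol)
```

Processing one element of by_chr at a time keeps only a fraction of the markers in memory during each scan, at the cost of running the scan once per chromosome.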

References

TODO

Examples


# NOT RUN {
  library(ctl)
  data(ath.metabolites)                 # Arabidopsis thaliana data set

  ctlscan <- CTLscan(ath.metab$genotypes, ath.metab$phenotypes, phenocol=1:4)
  ctlscan

  # Genetic regions with significant CTLs found for the first phenotype
  CTLregions(ctlscan, ath.metab$map, phenocol = 1)
  
  # Use a name other than 'summary' to avoid masking base::summary()
  significant <- CTLsignificant(ctlscan)    # Get the significant CTLs
  significant                               # Matrix of Trait, Marker, Trait interactions

  nodes <- ctl.lineplot(ctlscan, ath.metab$map)  # Line plot the phenotypes
  nodes
# }
