NormalizedASKAT.region: Run the normalized ASKAT method on a genomic region defined by a start and a stop base pair coordinate

Description

Runs the normalized ASKAT method on a given genomic region. Rank-based normalization is applied to the phenotype residuals under the null model, after adjusting for covariate effects.
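The rank-based normalization step can be illustrated with a minimal sketch. It assumes an ordinary least-squares adjustment and a rankit-style offset of 0.5; the package fits its own null model that accounts for relatedness, and the exact transform it uses is not stated here. The data and the helper name rank_normalize are hypothetical.

```r
# Toy data (hypothetical): intercept plus one covariate, skewed noise.
set.seed(1)
n <- 50
X <- cbind(1, rnorm(n))
y <- as.vector(X %*% c(2, 0.5) + rexp(n))

# Residuals after adjusting for covariates. Plain least squares is used
# here as a stand-in; relatedness between individuals is ignored.
fit <- lm.fit(X, y)
res <- fit$residuals

# Rank-based inverse normal transform: map ranks to normal quantiles.
# The 0.5 offset is an assumption, not necessarily the package's choice.
rank_normalize <- function(r) qnorm((rank(r) - 0.5) / length(r))
res_norm <- rank_normalize(res)
```

The transform keeps the ordering of the residuals and only reshapes their distribution towards a standard normal.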

Usage

NormalizedASKAT.region(y = NULL, X = NULL, Phi = NULL, type = "bed", filename = NULL, map = NULL, chr = 0, startpos = 0, endpos = 0, regionname = NULL, U = NULL, S = NULL, RH.Null = NULL, weights = NULL)

Arguments

y

vector of phenotype data (one entry per individual), of length $n$.

X

matrix of covariates, including the intercept (dimension: $n \times p$, with $p$ the number of covariates)

Phi

Relationship matrix (i.e. twice the kinship matrix); an $n \times n$ square symmetric positive-definite matrix.

type

character, format of the input file containing haplotype data: 'ped', 'bed' (default) or 'shapeit-haps'

filename

character, path to input file containing haplotype data

map

data.frame with 3 columns: rsID, chromosome and position in bp, as output by e.g. readMapFile.

chr

character, chromosome number (typically 1 to 22, as used by PLINK) on which the region of interest is located

startpos

numeric, start position (in bp, base pairs) of the region of interest (default: 0)

endpos

numeric, end position (in bp, base pairs) of the region of interest (default: 0)
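As an illustration of how chr, startpos and endpos delimit a region, the sketch below selects markers from a toy map. The column names, the helper select_region and the inclusive boundaries are assumptions made for the example, not the package's documented behaviour.

```r
# Toy map with the three columns described for the map argument
# (column names are hypothetical).
map <- data.frame(
  rsID = c("rs1", "rs2", "rs3", "rs4", "rs5"),
  chromosome = c(1, 1, 1, 2, 2),
  position = c(1000, 2500, 4000, 1500, 3000),
  stringsAsFactors = FALSE
)

# Keep markers on the requested chromosome whose position lies in
# [startpos, endpos]; inclusive bounds are an assumption.
select_region <- function(map, chr, startpos, endpos) {
  keep <- map$chromosome == chr &
    map$position >= startpos &
    map$position <= endpos
  map[keep, , drop = FALSE]
}

region <- select_region(map, chr = 1, startpos = 2000, endpos = 5000)
```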

regionname

(character) Name of the region/gene on which you are running the association test. This name is used in the output of this function and can be used to distinguish different regions if this function is run multiple times.

U

(optional) Matrix of eigenvectors of the relationship matrix, obtained from its spectral decomposition: $\Phi = U S U^T$. If this parameter is not given, it will be computed, so when running this function for many regions, time can be saved by specifying not only Phi, but also S and U.

S

(optional) Matrix of eigenvalues of the relationship matrix, obtained from its spectral decomposition: $\Phi = U S U^T$. If this parameter is not given, it will be computed, so when running this function for many regions, time can be saved by specifying not only Phi, but also S and U.
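A minimal sketch of precomputing the spectral decomposition once with base R's eigen(), so that it could be reused across regions. The toy relationship matrix is made up, and passing the eigenvalues as a diagonal matrix (rather than a vector) is an assumption based on the description "matrix of eigenvalues".

```r
# Toy symmetric positive-definite matrix standing in for Phi (hypothetical).
set.seed(42)
n <- 5
A <- matrix(rnorm(n * n), n, n)
Phi <- crossprod(A) / n + diag(n)

# Spectral decomposition Phi = U S U^T.
eig <- eigen(Phi, symmetric = TRUE)
U <- eig$vectors        # columns are eigenvectors
S <- diag(eig$values)   # assumption: eigenvalues on the diagonal of S

# Largest reconstruction error; should be close to machine precision.
max(abs(U %*% S %*% t(U) - Phi))
```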

RH.Null

(optional) output of the Estim.H0.NormalizedASKAT function. In practice, the null model does not need to be re-estimated for every region; one estimation per trait is enough.

weights

optional numeric vector of genotype weights. If this option is not specified, the beta distribution is used for weighting the variants, with each weight given by $w_i = dbeta(f_i, 1, 25)^2$, with $f_i$ the minor allele frequency (MAF) of variant $i$. This default is the same as used by the SKAT package. This vector is used as the diagonal of the $m \times m$ matrix $W$, with $m$ the number of variants.
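The default weighting described above can be reproduced with base R's dbeta(). The minor allele frequencies below are made-up values for illustration.

```r
# Hypothetical minor allele frequencies for m = 4 variants.
maf <- c(0.001, 0.01, 0.05, 0.2)

# Default weights: squared Beta(1, 25) density evaluated at each MAF.
w <- dbeta(maf, 1, 25)^2

# Diagonal m x m weight matrix W.
W <- diag(w)
```

With these shape parameters the weights decrease as the MAF increases, so rarer variants contribute more to the test.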

Value

A data frame containing the results of the association test. The data frame contains the following columns:

Score.Test: the score of the given association test
P.value: the p-value of the association test
N.Markers: the number of markers in the region
regionname: Name of the region/gene on which you are running the association test