This function takes as input the genotypes of SNPs (GENO), the sex (SEX), and a quantitative trait (Y) in a sample population, and optionally additional covariates, such as principal components. It returns the scale association p-values for each SNP.
scaleReg(
GENO,
Y,
COVAR = NULL,
SEX = NULL,
Xchr = FALSE,
transformed = FALSE,
loc_alg = "LAD",
related = FALSE,
cov.structure = "corCompSymm",
clust = NULL,
genotypic = FALSE,
origLev = FALSE,
centre = "median"
)
GENO: a list of genotype matrices/vectors of SNPs; values must be 0, 1, or 2, coding the number of reference alleles. Alternatively, for imputed genotypes, it can be either a vector of dosage values between 0 and 2, or a list of matrices of genotype probabilities, each between 0 and 1 for each genotype. The length/dimension of GENO should match that of Y, and of SEX and/or COVAR when supplied.
Y: a vector of quantitative trait values, such as human height.
COVAR: optional: a vector or matrix of covariates that are used to reduce bias due to confounding, such as age.
SEX: optional: the genetic sex of individuals in the sample population; must be a vector of 1s and 2s following the default PLINK sex code (1 for males, 2 for females).
Xchr: a logical indicating whether the analysis is for X-chromosome SNPs.
transformed: a logical indicating whether the quantitative response Y should be transformed using a rank-based method to resemble a normal distribution; recommended for traits with a non-symmetric distribution. The default option is FALSE.
loc_alg: a character indicating the type of algorithm used to compute the centre in stage 1; the value is either "OLS", corresponding to an ordinary linear regression under Gaussian assumptions to compute the mean, or "LAD", corresponding to a quantile regression to compute the median. The recommended default option is "LAD". For the quantile regression, the function calls quantreg::rq, and the median is estimated using either the "br" (smaller samples) or "sfn" (larger samples and sparse problems) algorithm depending on the sample size; for more details see ?quantreg::rq.
related: optional: a logical indicating whether the samples should be treated as related; if TRUE and no relatedness covariance information is given, the covariance is estimated under cov.structure, and this structure is assumed among all within-group errors pertaining to the same pair/cluster if specified using clust. This option currently only applies to autosomal SNPs.
cov.structure: optional: should be one of the standard classes of correlation structures listed in corClasses from the R package nlme. See ?corClasses. The most commonly used option is corCompSymm for a compound symmetric correlation structure. This option currently only applies to autosomal SNPs.
clust: optional: a factor indicating the grouping of samples; it should have at least two distinct values. It could be the family ID (FID) for family studies. This option currently only applies to autosomal SNPs.
genotypic: a logical indicating whether variance homogeneity should be tested with respect to an additively (linearly) coded or non-additively coded geno_one. The former has one fewer degree of freedom than the latter and is the default option. For dosage genotypes without genotypic probabilities, genotypic is forced to be FALSE.
origLev: a logical indicating whether the reported p-values should also include the original Levene's test.
centre: a character indicating whether the absolute deviation should be calculated with respect to "median" or "mean" in the traditional sex-specific and Fisher combined Levene's test p-values (three tests) for the X-chromosome. The default value is "median". This option applies to sex-specific analysis using the original Levene's test (i.e. when regression=TRUE).
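The two-stage idea behind the loc_alg and genotypic options can be illustrated with a minimal base-R sketch. This is not the package's implementation: it uses OLS for stage 1 (a median regression such as quantreg::rq with tau = 0.5 would play the same role for "LAD"), ignores covariates, sex, and relatedness, and all object names are illustrative.

```r
set.seed(1)
N <- 1000
g <- rbinom(N, 2, 0.3)             # additive genotype coding (0, 1, 2)
y <- rnorm(N, sd = 1 + 0.5 * g)    # simulated trait whose spread grows with genotype

# Stage 1: estimate the centre of the trait given genotype (OLS here, an assumption)
stage1 <- lm(y ~ g)
abs_dev <- abs(resid(stage1))      # absolute deviations from the fitted centre

# Stage 2: regress the absolute deviations on genotype
stage2_add <- lm(abs_dev ~ g)            # additive coding, 1 df
stage2_geno <- lm(abs_dev ~ factor(g))   # genotypic (non-additive) coding, 2 df

p_additive <- anova(stage2_add)[["Pr(>F)"]][1]
p_genotypic <- anova(stage2_geno)[["Pr(>F)"]][1]
df_additive <- anova(stage2_add)[["Df"]][1]
df_genotypic <- anova(stage2_geno)[["Df"]][1]
```

The additive model spends one degree of freedom on genotype and the genotypic model two, which matches the description of genotypic above.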
a vector of Levene's test regression p-values according to the models specified.
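For orientation, the original Levene's test and a Fisher combination of sex-specific p-values can be sketched in base R as follows. The helper levene_p is hypothetical and written only for this illustration; the simulated data, the male genotype coding (0/1), and the way the two p-values are combined are assumptions, and the package may compute these quantities differently.

```r
# Hypothetical helper: one-way ANOVA on absolute deviations from the group centre
levene_p <- function(y, group, centre = c("median", "mean")) {
  centre <- match.arg(centre)
  centre_fun <- if (centre == "median") median else mean
  group <- factor(group)
  abs_dev <- abs(y - ave(y, group, FUN = centre_fun))
  anova(lm(abs_dev ~ group))[["Pr(>F)"]][1]
}

set.seed(2)
n <- 500
geno_f <- rbinom(n, 2, 0.3)              # females: 0, 1, 2
geno_m <- rbinom(n, 1, 0.3)              # males: 0, 1 (assumed coding)
y_f <- rnorm(n, sd = 1 + 0.5 * geno_f)
y_m <- rnorm(n, sd = 1 + 0.5 * geno_m)

# Sex-specific tests with deviations taken from the group median
p_f <- levene_p(y_f, geno_f, centre = "median")
p_m <- levene_p(y_m, geno_m, centre = "median")

# Fisher's method: -2 * sum(log p) follows a chi-square with 2k df (k = 2 tests)
fisher_stat <- -2 * (log(p_f) + log(p_m))
p_fisher <- pchisq(fisher_stat, df = 4, lower.tail = FALSE)
```

Switching centre to "mean" gives the classical mean-centred version of the same test.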
Deng WQ, Mao S, Kalnapenkis A, Esko T, Magi R, Pare G, Sun L. (2019) Analytical strategies to include the X-chromosome in variance heterogeneity analyses: Evidence for trait-specific polygenic variance structure. Genet Epidemiol. 43(7):815-830. 10.1002/gepi.22247. PMID:31332826.
Gastwirth JL, Gel YR, Miao W. (2009). The Impact of Levene's Test of Equality of Variances on Statistical Theory and Practice. Statistical Science. 24(3):343-360. 10.1214/09-STS301.
Soave D, Sun L. (2017). A generalized Levene's scale test for variance heterogeneity in the presence of sample correlation and group uncertainty. Biometrics. 73(3):960-971. 10.1111/biom.12651. PMID:28099998.
# NOT RUN {
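set.seed(1)                                # make the simulated data reproducible
N <- 1000
genoDAT <- rbinom(N, 2, 0.3)               # genotypes coded 0, 1, 2
sex <- rbinom(N, 1, 0.5) + 1               # PLINK coding: 1 = male, 2 = female
Y <- rnorm(N)                              # quantitative trait
covar <- matrix(rnorm(N * 10), ncol = 10)  # ten covariates
# vanilla example:
scaleReg(GENO = list(genoDAT, genoDAT), Y = Y, COVAR = covar)
# test against a non-additively coded genotype:
scaleReg(GENO = list(genoDAT, genoDAT), Y = Y, COVAR = covar, genotypic = TRUE)
# also report the original Levene's test:
scaleReg(GENO = list(genoDAT, genoDAT), Y = Y, COVAR = covar, origLev = TRUE)
# same, with sex supplied:
scaleReg(GENO = list(genoDAT, genoDAT), Y = Y, COVAR = covar, origLev = TRUE, SEX = sex)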
# }
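The examples above use hard-call genotypes. The imputed forms mentioned for GENO (dosages and genotype probabilities) could be prepared along the following lines. This is only a sketch of the data shapes: the N x 3 layout of the probability matrix and the column names P0, P1, P2 are assumptions, and the exact structure scaleReg expects should be checked against the package before use.

```r
set.seed(3)
N <- 6

# Hard calls: number of reference alleles per individual
geno_hard <- rbinom(N, 2, 0.3)

# Genotype probabilities: one row per individual, one column per genotype (assumed layout)
raw <- matrix(runif(N * 3), ncol = 3)
geno_prob <- raw / rowSums(raw)              # each row sums to 1
colnames(geno_prob) <- c("P0", "P1", "P2")   # illustrative column names

# Dosage: expected allele count, P(1) + 2 * P(2), always between 0 and 2
dosage <- geno_prob[, "P1"] + 2 * geno_prob[, "P2"]

# Sex in PLINK coding: 1 = male, 2 = female
sex <- rbinom(N, 1, 0.5) + 1

# Basic sanity checks on the encodings
stopifnot(all(geno_hard %in% 0:2))
stopifnot(all(dosage >= 0 & dosage <= 2))
stopifnot(all(sex %in% c(1, 2)))
```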