sARTP
, rARTP
. It will be set by function options.default
by default. out.dir
getwd
. id.str
seed
Options for testing an association:
method
nperm
nthread
detectCores()
to use all available processors. Options for controlling data cleaning:
snp.miss.rate
snp.miss.rate
will be removed from the analysis. The default is 0.05. maf
maf
will be removed from the analysis. The default is 0.05. HWE.p
HWE.p
will be removed from the analysis. The test is applied to the genotype data or reference data. The test is ignored if the imputed genotypes are not encoded as 0/1/2. The default is 1E-5. gene.R2
cor
function will be called to compute the R^2 values between each pair of SNPs and remove the SNP with the lower MAF from each pair with R^2 greater than gene.R2
. The default is 0.95. chr.R2
cor
function will be called to compute the R^2 values between each pair of SNPs and remove the SNP with the lower MAF from each pair with R^2 greater than chr.R2
. The default is 0.95. gene.miss.rate
gene.miss.rate
will be removed from the analysis. The missing rate is calculated as the number of subjects with at least one missing genotype among all SNPs in the gene divided by the total number of subjects. The default is 1.0. rm.gene.subset
TRUE
to remove genes which are subsets of other genes. The default is TRUE
. turn.off.filters
TRUE
, it is equivalent to setting snp.miss.rate = 1
, maf = 0
, trim.huge.chr
, gene.R2 = 1
, chr.R2 = 1
, huge.gene.R2 = 1
, huge.chr.R2 = 1
, and HWE.p = 0
. The default is FALSE
. impute
TRUE
to impute missing genotypes with the mean of a SNP. FALSE
to handle missing data without imputation when constructing the score statistics, which is considered more powerful but also more time-consuming. The default is FALSE
. If the pathway is large and the missing rates are expected to be low, consider setting it to TRUE
manually to reduce the computational burden. Power may improve with impute
set to FALSE
if the missing rate is high, e.g., the data are combined from multiple studies, and a SNP has missing genotypes because it is not measured or successfully imputed in some of the participating studies. group.gap
NULL
, i.e., regrouping is not performed. delete
TRUE
to delete temporary files containing the test statistics for each gene. The default is TRUE
. print
TRUE
to print information to the console. The default is TRUE
. tidy
deleted.snps
in the returned object of sARTP
containing information of SNPs excluded from the analysis and their reasons. Possible reason codes include RM_BY_SNP_NAMES
, RM_BY_REGIONS
, NO_SUM_STAT
, NO_RAW_GENO
, NO_REF
, SNP_MISS_RATE
, SNP_LOW_MAF
, SNP_CONST
, SNP_HWE
, GENE_R2
, HUGE_GENE_R2
, CHR_R2
, HUGE_CHR
, HUGE_CHR2
, HUGE_CHR3
, GENE_MISS_RATE
, GENE_SUBSET
, CONF_ALLELE_INFO
, LACK_OF_ACCU_BETA
. Set tidy
to TRUE
to hide the SNPs with codes NO_SUM_STAT
and NO_REF
. The default is TRUE
. save.setup
TRUE
to save the necessary data (e.g., working options, observed scores, and covariance matrix) locally so that the analysis can be repeated more quickly (skipping data loading and filtering). It will be set to TRUE
if only.setup
is TRUE
. The default is FALSE
. path.setup
warm.start
if save.setup
is TRUE
. The default is NULL
so that it is set as paste(out.dir, "/setup.", id.str, ".rda", sep = "")
. only.setup
TRUE
if only the setup is needed and the testing procedure is not. The R code that creates the setup is single-threaded, but the testing procedure can be multi-threaded. The best practice for using ARTP2
on a multi-threaded cluster is to first create the setup in single-thread mode, and then call warm.start
to compute the p-values in multi-thread mode, using the saved setup at path.setup
as input. save.setup
will be set to TRUE
if only.setup
is TRUE
. The default is FALSE
. keep.geno
TRUE
if the reference genotypes of SNPs in the pathway are to be returned. The default is FALSE
. excluded.snps
NULL
if no SNP is excluded. The default is NULL
. selected.snps
NULL
if all SNPs are selected but other filters may be applied. The default is NULL
. excluded.regions
Chr
, Start
, End
, or three columns Chr
, Pos
, Radius
. The unit is base-pair (bp). SNPs within [Start, End]
or [Pos - Radius, Pos + Radius]
will be excluded. See Examples
in sARTP
. This option is only available for sARTP
. The default is NULL
. excluded.subs
fam
files in reference
. The default is NULL
. selected.subs
fam
files in reference
. The default is NULL
. excluded.genes
NULL
if no gene is excluded. The default is NULL
. meta
TRUE
to return meta-analysis summary data from sARTP
. The default is FALSE
. Options for handling huge pathways:
trim.huge.chr
TRUE
the additional options below are in effect. The default is TRUE
. huge.gene.size
huge.gene.size
will be further trimmed with huge.gene.R2
if trim.huge.chr
is TRUE
. The default is 1000. huge.chr.size
huge.chr.size
will be further trimmed with huge.chr.R2
if trim.huge.chr
is TRUE
. The default is 2000. huge.gene.R2
gene.R2
. The default is gene.R2
- 0.05. huge.chr.R2
chr.R2
. The default is chr.R2
- 0.05. Options for gene-based test:
inspect.snp.n
Details
) inspect.snp.percent
x
between 0 and 1 such that a truncation point will be defined at every x
percent of the top SNPs. The default is 0 so that the truncation points will be 1:inspect.snp.n
. (See Details
) Options for pathway-based test:
inspect.gene.n
inspect.gene.percent
x
between 0 and 1 such that a truncation point will be defined at every x
percent of the top genes. If 0 then the truncation points will be 1:inspect.gene.n
. The default is 0.05.

1. Apply the options excluded.snps and selected.snps if non-NULL. Code: RM_BY_SNP_NAMES.
2. Apply the option excluded.regions if non-NULL and if sARTP is used. Code: RM_BY_REGIONS.
3. Remove SNPs without summary statistics in summary.files. Code: NO_SUM_STAT; or remove SNPs without raw genotype data in data or geno.files. Code: NO_RAW_GENO.
4. Remove SNPs not in bim files in reference if sARTP is used. Code: NO_REF.
5. Remove SNPs with conflicting allele information in summary and reference data if sARTP is used. Code: CONF_ALLELE_INFO.
6. Remove SNPs with high missing rate. Code: SNP_MISS_RATE.
7. Remove SNPs with low MAF. Code: SNP_LOW_MAF.
8. Remove constant SNPs. Code: SNP_CONST.
9. Remove SNPs that fail the HWE test. Code: SNP_HWE.
10. Remove highly correlated SNPs within each gene. Code: GENE_R2 or HUGE_GENE_R2.
11. Remove highly correlated SNPs within each chromosome. Code: CHR_R2, HUGE_CHR, HUGE_CHR2 or HUGE_CHR3.
12. Remove genes with high missing rate. Code: GENE_MISS_RATE.
13. Remove genes which are subsets of other genes. Code: GENE_SUBSET.

Example truncation points defined by inspect.snp.n and inspect.snp.percent:
Assume the number of SNPs in a gene is 100. Below are examples of the truncation points for different values of inspect.snp.n and inspect.snp.percent. The same rules apply to inspect.gene.n and inspect.gene.percent.
inspect.snp.n | inspect.snp.percent | truncation points |
1 | 0 | 1 |
1 | 0.05 | 5 |
1 | 0.25 | 25 |
1 | 1 | 100 |
2 | 0 | 1, 2 |
2 | 0.05 | 5, 10 |
2 | 0.25 | 25, 50 |
2 | 1 | 100 |
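The rows of the table above can be reproduced with a short sketch. The helper name truncation_points is hypothetical and not part of ARTP2, and the rounding rule (round to the nearest SNP, cap at the gene size, drop duplicates) is an assumption chosen to match the table, not a description of the package's internal code.

```r
# Hypothetical helper, not part of ARTP2: truncation points for a gene with
# n.snp SNPs, given values of inspect.snp.n and inspect.snp.percent.
truncation_points <- function(inspect.n, inspect.percent, n.snp) {
  if (inspect.percent == 0) {
    # With a percentage of 0 the points are simply 1:inspect.n
    pts <- seq_len(inspect.n)
  } else {
    # One point at every inspect.percent of the top SNPs (assumed rounding)
    pts <- round(seq_len(inspect.n) * inspect.percent * n.snp)
  }
  # Keep points within [1, n.snp] and drop duplicates
  as.integer(unique(pmin(pmax(pts, 1), n.snp)))
}

truncation_points(2, 0.05, 100)  # 5 10
truncation_points(2, 1, 100)     # 100
```

The same function can be called with gene counts in place of SNP counts to illustrate inspect.gene.n and inspect.gene.percent.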
options.default
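library(ARTP2)
# Retrieve the default options as a named list
options <- options.default()
# Inspect the structure and the names of the available options
str(options)
names(options)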
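Several defaults described in the arguments depend on other options: path.setup falls back to paste(out.dir, "/setup.", id.str, ".rda", sep = ""), and huge.gene.R2 and huge.chr.R2 fall back to gene.R2 - 0.05 and chr.R2 - 0.05. The sketch below applies those rules to a plain list without calling ARTP2. The helper name derive_defaults and the example values for out.dir and id.str are hypothetical, and the package may resolve these defaults differently internally.

```r
# Hypothetical helper, not part of ARTP2: fill in options whose documented
# defaults depend on the values of other options.
derive_defaults <- function(opt) {
  if (is.null(opt$path.setup)) {
    # Documented default location of the saved setup file
    opt$path.setup <- paste(opt$out.dir, "/setup.", opt$id.str, ".rda", sep = "")
  }
  # Documented defaults: 0.05 below the regular R^2 thresholds
  if (is.null(opt$huge.gene.R2)) opt$huge.gene.R2 <- opt$gene.R2 - 0.05
  if (is.null(opt$huge.chr.R2)) opt$huge.chr.R2 <- opt$chr.R2 - 0.05
  opt
}

# Example values for out.dir and id.str are made up for illustration
opt <- derive_defaults(list(out.dir = "results", id.str = "run1",
                            gene.R2 = 0.95, chr.R2 = 0.95))
opt$path.setup
opt$huge.gene.R2
```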