Usage
REGENT.model(AnalysisName, LocusFile=NULL, EnvFile=NULL,
    prev=0.001, cv=0.05, alpha=0.05, sims=100000,
    indsims=100000, SmallSampAdjust=0.5, BaseRange=0.01,
    PlotMax=5, Block=100)
Arguments
AnalysisName
String, must be provided. Output files will be named according to this argument. Running multiple analyses with the same name will cause previous files to be overwritten.
LocusFile
File path string. Location of the file containing the table of SNP input data. Required columns should have the headers SNP, MAF, Ncase and Ncontrol. Risks should be provided either in one column with header RR, or in two columns with headers RR_het and RR_hom. Each SNP is a row. Additional columns may be provided but will be ignored.
EnvFile
File path string. Location of the file containing the table of environmental risk data. Required columns should have the headers Factor, Exposure, RR and SE. If multiple exposure levels exist, the columns should be named Factor, RR1, Exposure1, SE1, RR2, Exposure2, SE2, etc. Each factor is a row. Additional columns may be provided but will be ignored.
prev
Prevalence of the disease or trait. Default 0.001.
cv
Coefficient of variation. Default 0.05.
alpha
One minus the desired confidence level of the intervals around multilocus risk estimates. Default 0.05, giving 95 percent confidence intervals.
sims
Number of simulations to perform for each single factor risk estimate, for obtaining confidence intervals. Default 100000.
indsims
Number of individuals in the simulated population, for obtaining multilocus genotype frequencies. Default 100000.
SmallSampAdjust
Adjustment for small sample sizes, when calculating the standard error of homozygous risk genotypes. Default 0.5.
BaseRange
Proportion of population used to calculate the baseline risk (the risk closest to the average in the population). This is to avoid choosing rare, uncertain risk estimates by chance. Default 0.01.
PlotMax
Value at which to truncate the Y-axis of risk distribution plots. High risks are typically rare and of less interest when assessing the distribution in the population. Default 5.
Block
Number of multilocus genotypes held in memory during confidence interval calculation. Higher values should decrease computation time. We advise increasing this substantially (10000+) on high performance systems. Default 100.
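The sketch below shows one way the two input tables described above might be prepared and passed to REGENT.model. It rests on several assumptions that are not stated in this page: that the files are tab-delimited text with a header row, that the package is installed under the name REGENT, and that the file names, the analysis name and every numeric value are made up purely for illustration. The model call is left switched off by default so that the file-writing part can be run on its own.

```r
# Hypothetical file locations in a temporary directory; replace with real paths.
locus_path <- file.path(tempdir(), "example_loci.txt")
env_path   <- file.path(tempdir(), "example_env.txt")

# SNP table: one row per SNP, using the single-column RR layout.
# All values are invented for illustration only.
loci <- data.frame(
  SNP      = c("rs0000001", "rs0000002"),
  MAF      = c(0.20, 0.35),
  Ncase    = c(2000, 1500),
  Ncontrol = c(2000, 1800),
  RR       = c(1.20, 0.90)
)

# Environmental table: one row per factor, single exposure level.
# All values are invented for illustration only.
env <- data.frame(
  Factor   = c("FactorA", "FactorB"),
  Exposure = c(0.30, 0.10),
  RR       = c(1.50, 2.00),
  SE       = c(0.10, 0.25)
)

# Assumption: tab-delimited text with a header row is an accepted format.
write.table(loci, locus_path, sep = "\t", quote = FALSE, row.names = FALSE)
write.table(env,  env_path,  sep = "\t", quote = FALSE, row.names = FALSE)

# Set to TRUE once the file format has been checked against the package.
run_model <- FALSE

if (run_model && requireNamespace("REGENT", quietly = TRUE)) {
  # Output files are named after the first argument, so reusing
  # "ExampleAnalysis" overwrites the results of an earlier run.
  REGENT::REGENT.model(
    "ExampleAnalysis",
    LocusFile = locus_path,
    EnvFile   = env_path,
    prev      = 0.001,
    alpha     = 0.05,
    Block     = 10000  # larger blocks are advised on high performance systems
  )
}
```

Arguments not named in the call keep the defaults listed under Usage. For SNPs with separate heterozygous and homozygous risks, the RR column would be replaced by RR_het and RR_hom.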