- obj
an object of class glmmkin or class glmmkin.multi, returned by fitting the null GLMM using glmmkin.
- infile
the input file name or an object of class SeqVarGDSClass. For plink binary genotype files, use only the prefix, without .bed, .bim or .fam; only SNP-major mode is recognized in the binary file. Alternatively, it can be the full name of a BGEN file (including the suffix .bgen), a GDS file (including the suffix .gds), or a plain text file with a delimiter (comma, space, tab or something else), with one row for each SNP and one column for each individual. In a plain text file, SNPs should be coded as numeric values (0/1/2 or dosages are allowed; A/C/G/T coding is not recognized), and there can be additional rows and columns to skip at the beginning. The order of individuals can differ from that of obj in the null GLMM (see the argument select). Some compressed files (.gz and .bz2) are also allowed. If infile is an object of class SeqVarGDSClass, the .gds file will be closed upon successful completion of the function.
- outfile
the output file name.
- BGEN.samplefile
path to the BGEN sample file. Required when the BGEN file does not contain sample identifiers or the select parameter is NULL (default = NULL).
- center
a logical switch for centering genotypes before tests. If TRUE, genotypes will be centered to have mean 0 before tests, otherwise raw values will be directly used in tests (default = TRUE).
- select
an optional vector indicating the order of individuals in infile. If supplied, the length must match the number of individuals in infile (default = NULL). Individuals to be excluded should be coded 0. For example, select = c(2, 3, 1, 0) means that the 1st individual in infile corresponds to the 2nd individual in obj, the 2nd individual in infile corresponds to the 3rd individual in obj, the 3rd individual in infile corresponds to the 1st individual in obj, and the 4th individual in infile is not included in obj. If there are any duplicated id_include in obj (longitudinal data analysis), indices in select should match the order of individuals with unique id_include in obj. For plink binary genotype files and GDS files, this argument is not required and the sample IDs are matched automatically.
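The mapping encoded by select can be sketched in base R. This is only an illustration of the semantics described above, not the package's internal code; the ID vectors and genotype values are hypothetical.

```r
# Hypothetical sample IDs: order in infile vs. order in the null model (obj)
infile_ids <- c("B", "C", "A", "D")
obj_ids    <- c("A", "B", "C")

# Position of each infile individual in obj; 0 marks individuals to exclude
select <- match(infile_ids, obj_ids, nomatch = 0)
select
# 2 3 1 0

# Hypothetical genotypes for one SNP, in infile order
geno_infile <- c(0, 1, 2, 1)

# Reorder into obj order, dropping the excluded individual
keep <- select > 0
geno_obj <- numeric(sum(keep))
geno_obj[select[keep]] <- geno_infile[keep]
geno_obj
# 2 0 1
```

For longitudinal data, the same construction would presumably use the unique IDs of obj (for example, unique() applied to its ID vector) in place of obj_ids.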
- MAF.range
a numeric vector of length 2 defining the minimum and maximum minor allele frequencies of variants that should be included in the analysis (default = c(1e-7, 0.5)).
- miss.cutoff
the maximum missing rate allowed for a variant to be included (default = 1, including all variants).
- missing.method
method of handling missing genotypes. Either "impute2mean" or "omit" (default = "impute2mean").
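The following base R sketch shows how center, MAF.range, miss.cutoff and missing.method = "impute2mean" could act on a small genotype matrix. It assumes that the allele frequency is half the mean genotype over non-missing individuals; the matrix, the cutoff of 0.5 and the variable names are hypothetical, and the package's internal computation may differ.

```r
# Hypothetical genotype matrix: one row per SNP, one column per individual
geno <- rbind(
  snp1 = c(0, 1, 2, NA, 1, 0),
  snp2 = c(0, 0, 0, 0, 0, 0),
  snp3 = c(NA, NA, NA, NA, 2, 1)
)
MAF.range   <- c(1e-7, 0.5)
miss.cutoff <- 0.5   # hypothetical value; the documented default is 1

# Per-SNP missing rate and minor allele frequency (assumed definition)
miss_rate   <- rowMeans(is.na(geno))
allele_freq <- rowMeans(geno, na.rm = TRUE) / 2
maf         <- pmin(allele_freq, 1 - allele_freq)

# snp2 fails the MAF lower bound; snp3 fails the missing-rate cutoff
keep <- miss_rate <= miss.cutoff & maf >= MAF.range[1] & maf <= MAF.range[2]

# "impute2mean": replace missing genotypes with the mean of observed values
g <- geno["snp1", ]
g[is.na(g)] <- mean(g, na.rm = TRUE)

# center = TRUE: shift genotypes to have mean 0 before testing
g_centered <- g - mean(g)
```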
- nperbatch
an integer for how many SNPs should be tested in a batch (default = 100). The computational time can increase dramatically if this value is too small or too large. The optimal value for best performance depends on the user's system.
- tol
the threshold for determining monomorphism. If a SNP has value range less than the tolerance, it will be considered monomorphic and its association test p-value will be NA (default = 1e-5). Only used when infile is a plain text file (or compressed .gz or .bz2 file).
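As a minimal sketch of the monomorphism rule above, the check can be written as a comparison of the genotype range against tol. The helper is_monomorphic and the example vectors are hypothetical and are not part of the package.

```r
tol <- 1e-5

# Hypothetical helper: TRUE when the range of observed values is below tol
is_monomorphic <- function(g, tol) {
  diff(range(g, na.rm = TRUE)) < tol
}

g_poly <- c(0, 1, 2, 1)          # clearly polymorphic
g_mono <- c(1, 1, 1, 1 + 1e-7)   # dosages that differ only by noise

is_monomorphic(g_poly, tol)
is_monomorphic(g_mono, tol)
```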
- infile.nrow
number of rows to read in infile, including the number of rows to skip at the beginning. If NULL, the program will determine how many rows there are in infile automatically and read all rows (default = NULL). Only used when infile is a plain text file (or compressed .gz or .bz2 file).
- infile.nrow.skip
number of rows to skip at the beginning of infile. Must be a nonnegative integer. Useful when header or comment lines are present (default = 0). Only used when infile is a plain text file (or compressed .gz or .bz2 file).
- infile.sep
delimiter in infile (default = "\t"). Only used when infile is a plain text file (or compressed .gz or .bz2 file).
- infile.na
symbol in infile to denote missing genotypes (default = "NA"). Only used when infile is a plain text file (or compressed .gz or .bz2 file).
- infile.ncol.skip
number of columns to skip before genotype data in infile. These columns can be SNP name, alleles and/or quality measures and should be placed at the beginning of each line. After skipping these columns, the program will read in genotype data and perform score tests. Must be a nonnegative integer. It is recommended that the SNP name be included as the first column in infile and that genotype data start from the second column or later (default = 1). Only used when infile is a plain text file (or compressed .gz or .bz2 file).
- infile.ncol.print
a vector indicating which column(s) in infile should be printed to the output directly. These columns can be SNP name, alleles and/or quality measures placed at the beginning of each line. Must be nonnegative integers, no greater than infile.ncol.skip and sorted numerically in ascending order. By default, it is assumed that the first column is the SNP name, that genotype data start from the second column, and that the SNP name should be carried over to the output (default = 1). Should be set to NULL if infile.ncol.skip is 0. Only used when infile is a plain text file (or compressed .gz or .bz2 file).
- infile.header.print
a character vector indicating column name(s) of column(s) selected to print by infile.ncol.print (default = "SNP"). Should be set to NULL if infile.ncol.skip is 0. Only used when infile is a plain text file (or compressed .gz or .bz2 file).
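To show how the plain text options fit together, the sketch below writes a small hypothetical genotype file and parses it with base R's read.table, using variables named after the arguments above. It only mimics the documented layout (rows to skip, annotation columns, then genotypes) and is not how the package reads files internally.

```r
# Hypothetical tab-delimited file: 1 comment line, 3 annotation columns,
# then one genotype column per individual
tmp <- tempfile(fileext = ".txt")
writeLines(c(
  "# hypothetical comment line",
  "rs1\tA\tG\t0\t1\t2\tNA",
  "rs2\tC\tA\t1\t1\t0\t2"
), tmp)

infile.nrow.skip    <- 1
infile.ncol.skip    <- 3
infile.ncol.print   <- 1:3
infile.header.print <- c("SNP", "Allele1", "Allele2")
infile.sep          <- "\t"
infile.na           <- "NA"

raw <- read.table(tmp, sep = infile.sep, skip = infile.nrow.skip,
                  na.strings = infile.na, stringsAsFactors = FALSE,
                  comment.char = "")

# Columns carried over to the output, labelled by infile.header.print
annot <- raw[, infile.ncol.print, drop = FALSE]
names(annot) <- infile.header.print

# Genotype data start after the skipped columns
geno <- as.matrix(raw[, -seq_len(infile.ncol.skip), drop = FALSE])
dim(geno)
# 2 4
```

Here infile.ncol.print has the same length as infile.header.print, and every printed column index is no greater than infile.ncol.skip, as the argument descriptions require.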
- is.dosage
a logical switch for whether imputed dosage should be used from a GDS infile (default = FALSE).
- ncores
a positive integer indicating the number of cores to be used in parallel computing (default = 1).
- verbose
a logical switch for whether a progress bar should be shown for a GDS infile (default = FALSE).