Simple but fast function for finding peaks in genome-wide association study (GWAS) data based on setting a minimum distance between peaks.
quick_peak(
  data,
  npeaks = NA,
  p_cutoff = 5e-08,
  span = 1e+06,
  min_points = 2,
  chrom = NULL,
  pos = NULL,
  p = NULL
)
Returns a vector of row indices into data, one for each peak found.
data: GWAS dataset (data.frame or data.table).
npeaks: Number of peaks to find. If set to NA, the algorithm finds all distinct peaks separated from one another by the region size specified by span.
p_cutoff: Cut-off for p-value significance; p-values above it are ignored.
span: Minimum genomic distance between peaks (default 1 Mb).
min_points: Minimum number of significant SNPs that must lie within the span of a peak. This removes peaks supported by only one or a few low p-value SNPs. To disable, set min_points to 1 or less.
chrom: Determines which column in data contains chromosome information. If NULL, the function tries to autodetect the column.
pos: Determines which column in data contains position information. If NULL, the function tries to autodetect the column.
p: Determines which column in data contains SNP p-values. If NULL, the function tries to autodetect the column.
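As a rough illustration of how these arguments fit together, the call below names the three columns explicitly instead of relying on autodetection. The toy table and its column names are hypothetical, passing column names as character strings is an assumption, and the call is guarded so that it only runs when quick_peak() is available in the current session.

```r
# Hypothetical toy GWAS table; the column names are illustrative only.
gwas <- data.frame(
  chrom = c(1, 1, 1, 2, 2),
  pos   = c(100000, 150000, 3000000, 500000, 520000),
  p     = c(1e-12, 3e-9, 1e-9, 4e-20, 1e-8)
)

# Run only if quick_peak() is available in the current session.
if (exists("quick_peak", mode = "function")) {
  # Name the columns explicitly rather than relying on autodetection
  # (column names given as strings: an assumption of this sketch).
  idx <- quick_peak(gwas, npeaks = 2, chrom = "chrom", pos = "pos", p = "p")
  # The return value indexes rows of `data`, so it can subset the table.
  print(gwas[idx, ])
}
```

Because the result is a vector of row indices, it can be used directly to pull the peak SNPs out of the original dataset.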
This function is designed for speed. SNP p-values are first filtered to those that are significant as specified by p_cutoff. Each peak is identified as the SNP with the lowest remaining p-value, and SNPs within the distance specified by span of that peak are then removed. Regions such as the HLA, whose peaks may well be broader than span, may therefore produce multiple entries.
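The procedure described above can be sketched in base R as a simple greedy loop. This is an illustration only, not the package's implementation: sketch_peaks is a hypothetical name, the columns are assumed to be called chrom, pos and p, and the sketch assumes that span is measured on either side of a peak and that the peak SNP itself counts towards min_points.

```r
# Illustrative sketch only; `sketch_peaks` is a hypothetical helper,
# not the function documented here.
sketch_peaks <- function(data, npeaks = NA, p_cutoff = 5e-08, span = 1e6,
                         min_points = 2) {
  if (is.na(npeaks)) npeaks <- Inf
  # Keep only rows whose p-value is below the cut-off (NA p-values drop out).
  cand <- which(data$p < p_cutoff)
  peaks <- integer(0)
  while (length(cand) > 0 && length(peaks) < npeaks) {
    # The candidate with the lowest p-value becomes the next peak.
    top <- cand[which.min(data$p[cand])]
    # Candidates on the same chromosome within `span` of that peak
    # (assumption: distance is measured on either side).
    near <- data$chrom[cand] == data$chrom[top] &
      abs(data$pos[cand] - data$pos[top]) <= span
    # Accept the peak only if enough significant SNPs support it
    # (assumption: the peak SNP itself is counted).
    if (sum(near) >= min_points) peaks <- c(peaks, top)
    # Remove the peak and its neighbours before searching again.
    cand <- cand[!near]
  }
  peaks
}

# Hypothetical toy data: two supported peaks and one isolated SNP.
gwas <- data.frame(
  chrom = c(1, 1, 1, 1, 2, 2, 3),
  pos   = c(100000, 150000, 400000, 3000000, 500000, 520000, 900000),
  p     = c(1e-12, 3e-9, 2e-10, 1e-9, 4e-20, 1e-8, 0.2)
)
idx <- sketch_peaks(gwas)
print(idx)
```

With this toy data the sketch returns rows 5 and 1. Row 4 is significant but has no significant neighbours within span, so it is dropped unless min_points is set to 1.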