Simple but fast function for finding peaks in genome-wide association study (GWAS) data based on setting a minimum distance between peaks.
quick_peak(
  data,
  npeaks = NA,
  p_cutoff = 5e-08,
  span = 1e+06,
  min_points = 2,
  chrom = NULL,
  pos = NULL,
  p = NULL
)

Returns a vector of row indices.

data: GWAS dataset (data.frame or data.table)

npeaks: Number of peaks to find. If set to NA, algorithm finds all
distinct peaks separated from one another by region size specified by
span.

p_cutoff: Specifies cut-off for p-value significance above which p-values
are ignored.

span: Minimum genomic distance between peaks (default 1 Mb)

min_points: Minimum number of p-value significant points which must lie
within the span of a peak. This removes peaks with single or only a few low
p-value SNPs. To disable set min_points to 1 or less.

chrom: Determines which column in data contains chromosome
information. If NULL tries to autodetect the column.

pos: Determines which column in data contains position information.
If NULL tries to autodetect the column.

p: Determines which column in data contains SNP p-values. If NULL
tries to autodetect the column.

This function is designed for speed. SNPs are first filtered to those whose
p-values are significant as specified by p_cutoff. The SNP with the lowest
remaining p-value is taken as a peak, SNPs within the distance specified by
span of that peak are removed, and the search continues on the SNPs that are
left. Regions such as the HLA, whose association signals may well be broader
than span, may therefore produce multiple entries.
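
The procedure described above can be illustrated with a short R sketch. This is not the source of quick_peak: sketch_peaks and the toy data frame gwas are hypothetical names used only for illustration, the column names are passed explicitly rather than autodetected, and the sketch assumes that "proximity" means the same chromosome and at most span base pairs on either side of the peak, and that min_points counts the peak SNP itself.

```r
# Illustrative sketch only, NOT the package's implementation of quick_peak().
# sketch_peaks() and its default column names are hypothetical.
sketch_peaks <- function(data, npeaks = NA, p_cutoff = 5e-08, span = 1e6,
                         min_points = 2,
                         chrom = "chrom", pos = "pos", p = "p") {
  # Row indices of significant SNPs; which() drops rows where p is NA.
  # Rows with missing chromosome or position are skipped as well.
  keep <- which(data[[p]] < p_cutoff &
                  !is.na(data[[chrom]]) & !is.na(data[[pos]]))
  peaks <- integer(0)
  while (length(keep) > 0 && (is.na(npeaks) || length(peaks) < npeaks)) {
    # Candidate peak: the remaining SNP with the lowest p-value
    top <- keep[which.min(data[[p]][keep])]
    # Assumption: neighbours are on the same chromosome, within span bp
    near <- data[[chrom]][keep] == data[[chrom]][top] &
      abs(data[[pos]][keep] - data[[pos]][top]) <= span
    # Assumption: min_points includes the candidate SNP itself
    if (sum(near) >= min_points) peaks <- c(peaks, top)
    # Remove the candidate and its neighbours before the next pass
    keep <- keep[!near]
  }
  peaks
}

# Toy data: a three-SNP peak on chr 1, an isolated hit on chr 1,
# a two-SNP peak on chr 2 and a non-significant SNP on chr 3
gwas <- data.frame(
  chrom = c(1, 1, 1, 1, 2, 2, 3),
  pos   = c(1.0e6, 1.2e6, 1.4e6, 5.0e6, 2.0e6, 2.1e6, 7.0e6),
  p     = c(1e-9, 1e-12, 1e-8, 1e-10, 1e-20, 1e-9, 0.5)
)

sketch_peaks(gwas)                  # 5 2
sketch_peaks(gwas, min_points = 1)  # 5 2 4
sketch_peaks(gwas, npeaks = 1)      # 5
```

In this sketch the isolated SNP in row 4 is reported only when min_points is 1, which mirrors how min_points is described as removing peaks supported by a single low p-value SNP.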