poolfstat (version 3.1.0)

data.merge: Merge two pooldata or countdata objects

Description

Merge two pooldata or countdata objects

Usage

data.merge(x1, x2, fake.pool.size = 1e+06, verbose = TRUE)

Value

A new `pooldata` or `countdata` object, depending on the input types.

Arguments

x1

First pooldata or countdata object to merge

x2

Second pooldata or countdata object to merge

fake.pool.size

Haploid sample size assigned to the samples of a `countdata` object when it is merged with a `pooldata` object to create a pseudo pooldata object containing all samples (default = 1e6); see Details.

verbose

If TRUE, print some information about the merge

Details

This function merges two objects of class `pooldata` and/or `countdata`, automatically checking their structure for consistency. The merging behavior depends on the relationship between sample names and SNP identifiers:

1. Merging different samples (same SNPs): If SNP names are identical but (pool or population) sample names differ, the function merges data from the distinct samples into a single `pooldata` or `countdata` object that includes all samples.

2. Merging different SNPs (same samples): If sample names are identical but SNP names differ, the SNP data from the two objects are combined for each shared sample, so that the resulting object contains all variants.

3. Merging a `countdata` object with a `pooldata` object: In this case, the function returns a `pooldata` object. Allele counts from the `countdata` object are converted into pseudo read counts. To ensure compatibility, the haploid sample size of each sample originally contained in the `countdata` object is set to the value of the `fake.pool.size` argument (default = 1e6). Setting this value to a very large number (as in the default) ensures that each read count is treated as originating from a distinct haploid individual, mimicking Pool-Seq data in which read coverage is much lower than the haploid sample size. This effectively disables Pool-Seq-specific bias corrections in downstream statistical analyses. Importantly, when merging objects of different types, only SNP-level merging is permitted: the population samples of the two objects are expected to be distinct (at least in terms of effective haploid sample sizes).
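The three cases above can be pictured with plain base-R matrices. This is only a minimal sketch, not the poolfstat implementation: the matrices, sample names, and the `n / (n - 1)` factor are hypothetical stand-ins chosen for illustration, with rows as SNPs and columns as samples.

```r
# Toy count matrices (rows = SNPs, columns = samples); all names are hypothetical
counts_a <- matrix(c(10, 4, 7, 12, 9, 3), nrow = 3,
                   dimnames = list(c("snp1", "snp2", "snp3"), c("popA", "popB")))

# Case 1: same SNPs, different samples -> samples are added as new columns
counts_b <- matrix(c(5, 8, 6), nrow = 3,
                   dimnames = list(c("snp1", "snp2", "snp3"), "popC"))
stopifnot(identical(rownames(counts_a), rownames(counts_b)))
merged_by_snp <- cbind(counts_a, counts_b)

# Case 2: same samples, different SNPs -> SNPs are added as new rows
counts_c <- matrix(c(2, 11, 6, 1), nrow = 2,
                   dimnames = list(c("snp4", "snp5"), c("popA", "popB")))
stopifnot(identical(colnames(counts_a), colnames(counts_c)))
merged_by_sample <- rbind(counts_a, counts_c)

# Case 3: a sample coming from count data receives a very large haploid size.
# The factor n / (n - 1) is an illustrative finite-sample correction (an
# assumption, not poolfstat's internal formula); it tends to 1 as n grows,
# which is why a huge fake size makes such a correction negligible.
pool_sizes <- c(popA = 50, popB = 50)
fake_size <- 1e6
merged_sizes <- c(pool_sizes, popC = fake_size)
finite_sample_factor <- merged_sizes / (merged_sizes - 1)

print(dim(merged_by_snp))
print(dim(merged_by_sample))
print(finite_sample_factor)
```

In this toy setting, the factor for the two real pools stays visibly above 1, while the factor for the sample given the fake size is indistinguishable from 1 for practical purposes.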

See Also

For a description of the `countdata` and `pooldata` objects, see countdata and pooldata

Examples

 make.example.files(writing.dir=tempdir())
 pooldata1=popsync2pooldata(sync.file=paste0(tempdir(),"/ex.sync.gz"),poolsizes=rep(50,15))
 pooldata2=pooldata1
 #Merge pooldata1 and pooldata2 by SNP (same SNPs, different pools)
 pooldata2@poolnames=paste0(pooldata2@poolnames,"_2") #pool names must be different
 data.merged=data.merge(pooldata1,pooldata2)
 #Merge pooldata1 and pooldata2 by POP (same pools, different SNPs)
 pooldata2=pooldata1
 pooldata2@snp.info[,1]=paste0(pooldata2@snp.info[,1],"_2") #SNP info must be different
 data.merged=data.merge(pooldata1,pooldata2)  
 #Merge pooldata1 with a countdata object
 #create a countdata object (NOTE: This example is just for the sake of illustration)
 pooldata2genobaypass(pooldata=pooldata1,writing.dir=tempdir())
 countdata=genobaypass2countdata(genobaypass.file=paste0(tempdir(),"/genobaypass")) 
 countdata@snp.info=pooldata1@snp.info
 countdata@popnames=paste0(countdata@popnames,"_2") #pop names must be different
 data.merged=data.merge(pooldata1,countdata)  