Haplin (version 6.2.1)

prepPed: Extract family and phenotype information from a ped-format file, to prepare for use in Haplin

Description

Creates a pedIndex file containing family information, a phenotype file, and optionally a "dummy" map file. The files are used by GenABEL when loading data into R, and by Haplin when converting from a GenABEL file to a Haplin file.

Usage

prepPed(pedfile, outdir, create.map = F, ask = T)

Arguments

pedfile

A character string giving the name and path of the ped-format file to be used.

outdir

The directory where the pedIndex file, phenotype file, and optionally the map file should be saved.

create.map

Logical. Default is "FALSE". If "TRUE", prepPed creates a dummy map file that GenABEL can use when loading data into R. Useful when no map file is available.

ask

Logical. Default is "TRUE". If set to "FALSE", already existing output files will be overwritten without asking.

Value

There is no useful output; the task of prepPed is to save the extracted information in the outdir directory.

Details

To use Haplin on a large ped-format file, it should first be converted to a GenABEL raw file and loaded into R. Since GenABEL does not retain family information available in the ped file, prepPed should first be run on the file to extract the necessary family and phenotype information. prepPed stores family information in a .pedIndex file with the same name as the ped file, and saves it in the outdir directory. Similarly, it creates a phenotype file (.ph), which contains the individual ID, the sex variable, and the case-control status. Optionally, it can construct a simple .map file, which can be used in situations where no real map file (corresponding to the ped file) is available.
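
The naming rule described above can be sketched in base R. The helper `expected_prepPed_files` is hypothetical and not part of Haplin; it only derives the paths that, going by the description and the example on this page, prepPed would be expected to write. This can be handy for spotting files that would be overwritten before calling prepPed with ask = FALSE.

```r
# Hypothetical helper (not part of Haplin): derive the output paths that
# prepPed is described as writing for a given ped file and outdir.
expected_prepPed_files <- function(pedfile, outdir, create.map = FALSE) {
  # Base name of the ped file without its extension, e.g. "mygwas"
  stem <- tools::file_path_sans_ext(basename(pedfile))
  # The .map file is only expected when create.map is TRUE
  exts <- c("pedIndex", "ph", if (create.map) "map")
  file.path(outdir, paste(stem, exts, sep = "."))
}

out <- expected_prepPed_files("data/mygwas.ped", outdir = "data", create.map = TRUE)

# Subset of the expected files that already exist on disk
existing <- out[file.exists(out)]
```

The sketch assumes the output names follow the pattern shown in the Examples section (mygwas.pedIndex, mygwas.ph, mygwas.map); it does not call prepPed itself.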

The format of the ped file should be something like this:

1104  1104-1  1104-2  1104-3  1  0  A  B  B  B
1104  1104-2       0       0  1  0  B  B  A  B
1104  1104-3       0       0  2  0  A  B  A  B
1105  1105-1  1105-2  1105-3  2  1  B  B  A  A
1105  1105-2       0       0  1  1  B  B  A  A
1105  1105-3       0       0  2  1  0  0  A  A

The column values are: Family id, Individual id, Father's id, Mother's id, Sex (1 = male, 2 = female), and Case-control status (0 = controls, 1 = cases).

Column 7 and onwards contain the genotype data, with alleles in separate columns, or joined, as AB BB, etc. A "0" is used to denote missing data.

Missing values in the sex and case-control columns are not accepted.
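
Because missing values in the sex and case-control columns are not accepted, it may be worth checking a ped file before running prepPed. The following is a minimal base-R sketch of such a pre-check, using the six example rows above with alleles in separate columns. The column names and the rule for what counts as invalid are assumptions made for illustration; this is not prepPed's own validation.

```r
# Write the six example rows from this page to a temporary ped file
ped_lines <- c(
  "1104  1104-1  1104-2  1104-3  1  0  A  B  B  B",
  "1104  1104-2       0       0  1  0  B  B  A  B",
  "1104  1104-3       0       0  2  0  A  B  A  B",
  "1105  1105-1  1105-2  1105-3  2  1  B  B  A  A",
  "1105  1105-2       0       0  1  1  B  B  A  A",
  "1105  1105-3       0       0  2  1  0  0  A  A"
)
pedfile <- tempfile(fileext = ".ped")
writeLines(ped_lines, pedfile)

# Read every column as character so IDs such as "1104-1" and the
# allele codes are kept verbatim
ped <- read.table(pedfile, header = FALSE, colClasses = "character")

# Illustrative names for the six leading columns (assumed, not required by Haplin)
names(ped)[1:6] <- c("family", "id", "father", "mother", "sex", "cc")

# Sex must be 1 or 2 and case-control status 0 or 1; anything else
# (including NA) is treated here as missing or invalid
bad_sex <- !(ped$sex %in% c("1", "2"))
bad_cc  <- !(ped$cc %in% c("0", "1"))
problems <- ped[bad_sex | bad_cc, c("family", "id", "sex", "cc")]
```

If `problems` has zero rows, the sex and case-control columns contain only the documented codes. Any rows it does list would need to be fixed or removed before calling prepPed.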

References

Gjessing HK and Lie RT. Case-parent triads: Estimating single- and double-dose effects of fetal and maternal disease gene haplotypes. Annals of Human Genetics (2006) 70, pp. 382-396. Web Site: http://folk.uib.no/gjessing/genetics/software/haplin/

See Also

convert.snp.ped, load.gwaa.data

Examples

# NOT RUN {
# Create the files mygwas.pedIndex, mygwas.ph and mygwas.map in the "data" directory
prepPed(pedfile = "data/mygwas.ped", outdir = "data", create.map = TRUE)

# }