coMET (version 1.2.0)

comet: Visualize EWAS results in a genomic region of interest

Description

coMET is an R-based package for visualizing the results of EWAS (epigenome-wide association scans) in a genomic region of interest. Its main feature is to plot the significance level of EWAS results in the selected region, along with the correlation in DNA methylation values between CpG sites in the region. The coMET package generates plots of phenotype association, co-methylation patterns, and a series of annotation tracks.
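
As a rough illustration of the co-methylation pattern described above, the sketch below computes a pairwise Spearman correlation matrix between CpG sites with base R. It is only a sketch under stated assumptions: the data are simulated, the object names are made up for illustration, and this page does not say that coMET computes its matrix with stats::cor().

```r
set.seed(1)

# Simulated methylation values in the "raw" layout:
# samples by row, CpG sites by column (toy data, not from coMET).
beta <- matrix(runif(20 * 4), nrow = 20,
               dimnames = list(paste0("sample", 1:20), paste0("cg", 1:4)))

# Pairwise Spearman correlation between CpG sites (columns).
cormat <- cor(beta, method = "spearman")

# The "raw_rev" layout has CpG sites by row and samples by column,
# so it is transposed back before computing the same matrix.
beta_rev <- t(beta)
cormat_rev <- cor(t(beta_rev), method = "spearman")

print(round(cormat, 2))
```

The result is a symmetric matrix with one row and one column per CpG site, which is the shape a pre-computed correlation matrix would also have.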

Usage

comet(mydata.file = NULL, mydata.format = "site", mydata.type = "file",
      mydata.large.file = NULL, mydata.large.format = "site",
      mydata.large.type = "listfile", cormatrix.file = NULL,
      cormatrix.method = "spearman", cormatrix.format = "raw",
      cormatrix.color.scheme = "bluewhitered", cormatrix.conf.level = 0.05,
      cormatrix.sig.level = 1, cormatrix.adjust = "none",
      cormatrix.type = "listfile", mydata.ref = NULL, start = NULL, end = NULL,
      zoom = FALSE, lab.Y = "log", pval.threshold = 1e-05,
      disp.pval.threshold = 1, disp.association = FALSE,
      disp.association.large = FALSE, disp.region = FALSE,
      disp.region.large = FALSE, symbols = "circle-fill", symbols.large = NA,
      sample.labels = NULL, sample.labels.large = NULL, use.colors = TRUE,
      disp.color.ref = TRUE, color.list = NULL, color.list.large = NULL,
      disp.mydata = TRUE, biofeat.user.file = NULL, biofeat.user.type = NULL,
      biofeat.user.type.plot = NULL, genome = "hg19",
      dataset.gene = "hsapiens_gene_ensembl", tracks.gviz = NULL,
      tracks.ggbio = NULL, tracks.trackviewer = NULL, disp.mydata.names = TRUE,
      disp.color.bar = TRUE, disp.phys.dist = TRUE, disp.legend = TRUE,
      disp.marker.lines = TRUE, disp.cormatrixmap = TRUE,
      disp.pvalueplot = TRUE, disp.type = "symbol", disp.mult.lab.X = FALSE,
      disp.connecting.lines = TRUE, palette.file = NULL, image.title = NULL,
      image.name = "coMET", image.type = NULL, image.size = 3.5,
      font.factor = NULL, symbol.factor = NULL, print.image = TRUE,
      connecting.lines.factor = 1.5, connecting.lines.adj = 0.01,
      connecting.lines.vert.adj = -1, connecting.lines.flex = 0,
      config.file = NULL, verbose = FALSE)

Arguments

mydata.file
Name of the info file describing the coMET parameters
mydata.format
Format of the input data in mydata.file. There are 4 different options: site, region, site_asso, region_asso.
mydata.type
Format of mydata.file. There are 2 different options: FILE or MATRIX.
mydata.large.file
Names of additional info files describing the coMET parameters, comma-separated. This option is optional, but any file supplied must be in tabular format with a header. An additional info file can be a list of CpG sites with or without a Beta value (DNA methylation level) or direction sign. A site file must contain the 4 mandatory columns, with headers, in the required order. Beta can be an optional 5th column, holding either a numeric value (positive or negative) or only a direction sign ("+", "-"). The number of columns and their types are determined by the option mydata.large.format.
mydata.large.format
Format of additional data to be visualised in the p-value plot. Format should be comma-separated. There are 4 different options for each file: site, region, site_asso, region_asso.
mydata.large.type
Format of mydata.large.file. There are 2 different options: listfile or listdataframe.
cormatrix.file
Name of the raw data file or the pre-computed correlation matrix file. It is mandatory and has to be a file in tabular format with a header.
cormatrix.method
Method used to calculate the correlation matrix. Options: spearman, pearson or kendall.
cormatrix.format
Format of the input cormatrix.file. There are two kinds of input: a raw file (raw if CpG sites are by column and samples by row, or raw_rev if CpG sites are by row and samples by column) and a pre-computed correlation matrix (cormatrix).
cormatrix.color.scheme
Color scheme options: heat, bluewhitered, cm, topo, gray, bluetored
cormatrix.conf.level
Alpha level for the confidence interval. Default: 0.05. The confidence interval is bounded by the lower and upper alpha/2 values.
cormatrix.sig.level
Significance level used to visualise the correlation. If the correlation has a p-value under the significance level, the correlation will be colored in "ghostwhite"; otherwise the color depends on the correlation level and the chosen color scheme. Default: 1.
cormatrix.adjust
Indicates which adjustment for multiple tests should be used. Options: "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none". Default: "none".
cormatrix.type
Format of cormatrix.file. There are 2 different options: listfile or listdataframe.
mydata.ref
The name of the reference omic feature (e.g. CpG site) listed in mydata.file.
start
The first nucleotide position to be visualised. It can be larger or smaller than the first position in the list of omic features.
end
The last nucleotide position to be visualised. It has to be larger than the value of the option start, but it can be smaller or larger than the last position in the list of omic features.
zoom
Default: FALSE
lab.Y
Scale of the y-axis. Options: log or ln
pval.threshold
Significance threshold to be displayed as a red dashed line
disp.pval.threshold
Display only the findings that pass the value put in disp.pval.threshold
disp.association
This logical option applies only if mydata.file contains the effect direction (mydata.format=site_asso or region_asso). If FALSE (default), each data point in the p-value plot is colored according to the co-methylation pattern between that point and the reference site. If TRUE, the effect direction is shown instead: a positive association uses the color defined by the option color.list, and a negative association uses the opposite color.
disp.association.large
This logical option applies only if mydata.large.file contains the effect direction (mydata.large.format=site_asso or region_asso). If FALSE (default), each data point in the p-value plot is colored according to the co-methylation pattern between that point and the reference site. If TRUE, the effect direction is shown instead: a positive association uses the color defined by the option color.list.large, and a negative association uses the opposite color.
disp.region
This logical option works only if mydata.file contains regions (mydata.format=region or region_asso). The value can be TRUE or FALSE (default). If TRUE, the genomic element will be shown by a continuous line with the color of the element, in addition to the symbol at the center of the region. If FALSE, only the symbol is shown.
disp.region.large
This logical option works only if mydata.large.file contains regions (mydata.large.format=region or region_asso). The value can be TRUE or FALSE (default). If TRUE, the genomic element will be shown by a continuous line with the color of the element, in addition to the symbol at the center of the region. If FALSE, only the symbol is shown.
symbols
The symbol shown in the p-value plot. Options: circle, square, diamond, triangle. Symbols can be filled by appending -fill, e.g. square-fill. Example: circle,diamond-fill,triangle
symbols.large
The symbol used to visualise the data defined in mydata.large.file. Options: circle, square, diamond, triangle. Symbols can be filled by appending -fill, e.g. square-fill. Example: circle,diamond-fill,triangle
sample.labels
Labels for the sample described in mydata.file to include in the legend
sample.labels.large
Labels for the sample described in mydata.large.file to include in the legend
use.colors
Logical option: TRUE (default) or FALSE. If TRUE, the defined colors are used; if FALSE, the grey color scheme is used.
disp.color.ref
Logical option: TRUE (default) or FALSE. If TRUE, the connecting line for the reference probe is drawn in purple; if FALSE, it stays black.
color.list
List of colors for displaying the P-value symbols related to the data in mydata.file
color.list.large
List of colors for displaying the P-value symbols related to the data in mydata.large.file
disp.mydata
Logical option: TRUE (default) or FALSE. If TRUE, the P-value plot is shown; if FALSE, the plot will be defined by Gviz.
biofeat.user.file
Name of data file to visualise in the tracks. File names should be comma-separated.
biofeat.user.type
Track type, where multiple tracks can be shown (comma-separated): DataTrack, AnnotationTrack, GeneregionTrack.
biofeat.user.type.plot
Format of the plot when the data are shown with the Gviz function DataTrack (comma-separated).
genome
The human genome reference, e.g. "hg19" for Human genome 19 (NCBI 37), "grch37" (GRCh37), "grch38" (GRCh38).
dataset.gene
The gene dataset from ENSEMBL, e.g. hsapiens_gene_ensembl.
tracks.gviz
List of tracks created by Gviz.
tracks.ggbio
List of tracks created by ggbio.
tracks.trackviewer
List of tracks created by trackViewer.
disp.mydata.names
Logical option: TRUE (default) or FALSE. If TRUE, the names of the CpG sites are displayed.
disp.color.bar
Color legend for the correlation matrix (range -1 to 1). Default: blue-white-red
disp.phys.dist
Logical option: TRUE (default) or FALSE. If TRUE, the bp distance is displayed on the plots.
disp.legend
Logical option: TRUE (default) or FALSE. If TRUE, the sample labels and corresponding symbols are displayed on the lower right side.
disp.marker.lines
Logical option: TRUE (default) or FALSE. If FALSE, the red line for pval.threshold is not shown.
disp.cormatrixmap
Logical option: TRUE (default) or FALSE. If FALSE, the correlation matrix is not shown.
disp.pvalueplot
Logical option: TRUE (default) or FALSE. If FALSE, the p-value plot is not shown.
disp.type
Default: symbol
disp.mult.lab.X
Logical option: TRUE or FALSE (default). If TRUE, evenly spaced X-axis labels are displayed; up to 5 labels are shown.
disp.connecting.lines
Logical option: TRUE (default) or FALSE. If TRUE, connecting lines are displayed between the p-value plot and the correlation matrix.
palette.file
File that contains the color scheme for the heatmap. Colors are hexadecimal HTML color codes, one color per line. If you do not want to use this option, the colors defined by the option cormatrix.color.scheme are used.
image.title
Title of the plot
image.name
The path and the name of the plot file without extension. The extension will be added by coMET depending on the option image.type.
image.type
Options: pdf or eps
image.size
Size of the image in inches. Possible sizes: 3.5 (default) or 7.
font.factor
Font size of the sample labels. Range: 0-1
symbol.factor
Size of the symbols. Range: 0-1
print.image
Logical option: TRUE (default) or FALSE. Whether to print the image to a file.
connecting.lines.factor
Length of the connecting lines. Range: 0-2
connecting.lines.adj
Horizontal position of the connecting lines. Negative values shift the connecting lines to the left and positive values shift them to the right. Range: -1 to 1; the value -1 means no connecting lines.
connecting.lines.vert.adj
Vertical position of the connecting lines. It can be used to adjust the position of the connecting lines vertically in relation to the CpG-site names. Negative values shift the connecting lines down. Range: -0.5 to 0; the value -1 selects the default for the plot size (-0.5 for the 3.5 plot size; -0.7 for the 7.5 plot size).
connecting.lines.flex
Adjusts the spread of the connecting lines. Range: 0-2
config.file
Configuration file that holds the values of these options, as an alternative to passing them on the command line. Each line of the file defines one option, with the option name and its value separated by "=". Multiple values, such as for the option list.tracks or the options for additional data, must be separated by a comma with no extra space (e.g. list.tracks=geneENSEMBL,CGI,ChromHMM,DNAse,RegENSEMBL,SNP).
verbose
Logical option: TRUE or FALSE (default, as given in Usage). If TRUE, comments are shown.
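
The config.file format described above (one option per line, name and value separated by "=", multiple values joined by commas with no spaces) can be produced from R. The following is a minimal sketch, not part of coMET: the option names are taken from the Arguments list, while the values, the variable names and the output file name are illustrative assumptions.

```r
# Option names come from the Arguments list; the values are examples only.
opts <- c(
  mydata.format       = "site",
  # Multiple values are joined by commas, without spaces.
  mydata.large.format = paste(c("site", "region"), collapse = ","),
  cormatrix.method    = "spearman",
  cormatrix.format    = "raw",
  symbols             = "circle-fill",
  image.type          = "pdf"
)

# One "name=value" pair per line.
config_lines <- paste0(names(opts), "=", opts)

# Hypothetical file name, written to a temporary directory.
config_path <- file.path(tempdir(), "my_comet_config.txt")
writeLines(config_lines, config_path)

print(readLines(config_path))
```

The resulting path could then be passed as config.file, in the same way as the shipped example configuration file used in the Examples section.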

Value

Creates a plot in PDF or EPS format, depending on the options supplied (see image.type).

Details

The function can visualize at most 120 omic features.
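
Because of this limit, it can help to count the omic features before plotting. The sketch below uses a hypothetical helper, check_feature_limit(), which is not part of coMET, and a toy table whose column names are placeholders rather than the real info-file layout.

```r
# Hypothetical helper (not part of coMET): stop early when a table holds
# more omic features than the documented limit of 120.
check_feature_limit <- function(info, max_features = 120L) {
  n <- nrow(info)
  if (n > max_features) {
    stop(sprintf("%d omic features found; the documented limit is %d.",
                 n, max_features))
  }
  invisible(n)
}

# Toy table standing in for an info file; column names are placeholders.
toy_info <- data.frame(id = paste0("cg", seq_len(150)), pos = seq_len(150))

# A table of exactly 120 features passes the check.
ok <- check_feature_limit(toy_info[1:120, ])

# The full table of 150 features triggers the error, captured here as text.
too_many <- tryCatch(check_feature_limit(toy_info),
                     error = function(e) conditionMessage(e))
print(too_many)
```

In practice the table would be read from the info file, and a narrower start and end window is one way to bring the number of features under the limit.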

References

http://epigen.kcl.ac.uk/comet/

See Also

comet.web, comet.list

Examples

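library(coMET)

# Locate the example files shipped with the package
extdata <- system.file("extdata", package="coMET",mustWork=TRUE)
configfile <- file.path(extdata, "config_cyp1b1_zoom_4comet.txt")
myinfofile <- file.path(extdata, "cyp1b1_infofile.txt")
myexpressfile <- file.path(extdata, "cyp1b1_infofile_exprGene_region.txt")
mycorrelation <- file.path(extdata, "cyp1b1_res37_rawMatrix.txt")

# Genomic region to visualise
chrom <- "chr2"
start <- 38290160
end <- 38303219
gen <- "hg19"
# 'strand' is passed to structureBiomart() below but was not defined;
# "*" (either strand) is assumed here.
strand <- "*"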

if(interactive()){
    cat("interactive")
    genetrack <-genesENSEMBL(gen,chrom,start,end,showId=TRUE)
    snptrack <- snpBiomart(chrom, start, end, 
                dataset="hsapiens_snp_som",showId=FALSE)
    strutrack <- structureBiomart(chrom, start, end, 
                strand, dataset="hsapiens_structvar_som")
    clinVariant<-ClinVarMainTrack(gen,chrom,start,end)
    clinCNV<-ClinVarCnvTrack(gen,chrom,start,end)
    gwastrack <-GWASTrack(gen,chrom,start,end)
    geneRtrack <-GeneReviewsTrack(gen,chrom,start,end)
    listgviz <- list(genetrack,snptrack,strutrack,clinVariant,
                 clinCNV,gwastrack,geneRtrack)
    comet(config.file=configfile, mydata.file=myinfofile, mydata.type="file",
      cormatrix.file=mycorrelation, cormatrix.type="listfile",
      mydata.large.file=myexpressfile, mydata.large.type="listfile",
      tracks.gviz=listgviz, verbose=FALSE, print.image=FALSE,disp.pvalueplot=FALSE)
} else {
    cat("Non interactive")
    data(geneENSEMBLtrack)
    data(snpBiomarttrack)
    data(ISCAtrack)
    data(strucBiomarttrack)
    data(ClinVarCnvTrack)
    data(clinVarMaintrack)
    data(GWASTrack)
    data(GeneReviewTrack)
    listgviz <- list(genetrack,snptrack,strutrack,clinVariant,
                clinCNV,gwastrack,geneRtrack)
    comet(config.file=configfile, mydata.file=myinfofile, mydata.type="file",
       cormatrix.file=mycorrelation, cormatrix.type="listfile",
        mydata.large.file=myexpressfile,  mydata.large.type="listfile",
        tracks.gviz=listgviz, verbose=FALSE, print.image=FALSE,disp.pvalueplot=FALSE)
}