
LedPred (version 1.6.0)

mapFeaturesToCRMs: R interface to bed_to_matrix REST in server

Description

The mapFeaturesToCRMs function creates a training set matrix for building a predictive model. The training set consists of positive regions (known to be involved in the pathway of interest) and negative regions (randomly picked, or known not to be involved in the pathway of interest), each described (scored) by features. Three types of feature file are accepted: position-specific scoring matrices modeling motifs recognised by transcription factors; bed files containing region coordinates for any discrete feature (NGS peaks, conservation blocks); and wig/bigWig files containing signal data. This script has been tested with version 0.99 of the online server. The current version of the server is available at http://ifbprod.aitorgonzalezlab.org/map_features_to_crms.php

Usage

mapFeaturesToCRMs(
  URL = "http://ifbprod.aitorgonzalezlab.org/map_features_to_crms.php",
  positive.bed = NULL, genome = NULL, negative.bed = NULL,
  shuffling = NULL, background.seqs = NULL, genome.info = NULL,
  pssm = NULL, background.freqs = NULL, ngs = NULL,
  bed.overlap = NULL, my.values = NULL,
  feature.ranking = NULL, feature.nb = NULL,
  crm.feature.file = NULL, stderr.log.file = NULL, stdout.log.file = NULL)

Arguments

URL
URL of the server REST target
positive.bed
Positive bed file path. Compulsory
genome
Genome code, e.g. dm3 for Drosophila melanogaster. Compulsory
negative.bed
Negative bed file path.
shuffling
Integer giving the number of times to shuffle the background sequences (background.seqs). If negative.bed is NULL and shuffling is set to 0, the feature matrix contains no negative sequences, which is useful for producing a test set matrix.
background.seqs
Background sequences used for shuffling. If shuffling = 0, set this parameter to 0.
genome.info
File required for shuffling the bed file. If shuffling = 0, set this parameter to 0.
pssm
Position specific scoring matrices
background.freqs
Background frequencies of nucleotides in genome
ngs
NGS (bed and wig) files
bed.overlap
Minimum overlap with an NGS bed peak, expressed as a fraction of the query sequence. Equivalent to the -f argument of intersectBed. Default: 1 bp.
my.values
Bed file whose fourth column holds the values to append to the SVM matrix
feature.ranking
File with ranked features (Output of rankFeatures). It is used for scoring a query bed file
feature.nb
Integer giving the number of features to use from the ranked list in feature.ranking
crm.feature.file
Path to feature matrix file
stderr.log.file
Path to error log
stdout.log.file
Path to standard output log
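
The shuffling, background.seqs and genome.info descriptions above together describe how to request a test set matrix with no negative sequences. A minimal sketch of that argument combination follows; the file paths are hypothetical placeholders, the run.on.server switch is an illustrative guard rather than part of the package, and feature arguments such as pssm or ngs would presumably be supplied as for a training set.

```r
# Sketch only: file paths are hypothetical, and the call is guarded so that
# nothing is sent to the server unless run.on.server is set to TRUE.
test.args <- list(
  positive.bed     = "query_regions.bed",  # hypothetical query regions
  genome           = "dm3",
  negative.bed     = NULL,                 # no negative regions
  shuffling        = 0,                    # no shuffled background
  background.seqs  = 0,                    # set to 0 because shuffling = 0
  genome.info      = 0,                    # set to 0 because shuffling = 0
  crm.feature.file = "test.features.tab"   # hypothetical output path
)

run.on.server <- FALSE  # illustrative switch, not a package option
if (run.on.server && requireNamespace("LedPred", quietly = TRUE)) {
  # do.call passes each list element as a named argument, including the NULL
  test.set <- do.call(LedPred::mapFeaturesToCRMs, test.args)
  names(test.set)
}
```

Keeping the arguments in a list makes it easy to check the combination before contacting the server.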

Value

A list with the following elements:
feature.matrix
A data frame in which each row is a region and each column is a feature. Each cell carries a score, and the first column is the response vector.
stdout.log
Standard output log of the mapFeaturesToCRMs script on the server
stderr.log
Standard error log of the mapFeaturesToCRMs script on the server
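
To illustrate the layout described for the feature matrix (regions in rows, response vector in the first column, features in the remaining columns), the sketch below builds a small stand-in data frame and separates the response from the features. All values, region names and feature names are made up, and the 1 / -1 coding of the response is an assumption for illustration only.

```r
# Toy stand-in for the returned feature matrix; every value here is invented.
feature.matrix <- data.frame(
  response = c(1, 1, -1, -1),           # assumed coding: 1 positive, -1 negative
  motif_A  = c(3.2, 2.8, 0.4, 0.1),     # hypothetical motif score
  ngs_peak = c(1, 0, 0, 1),             # hypothetical peak overlap feature
  row.names = c("region1", "region2", "region3", "region4")
)

# The first column is the response vector; the rest are feature scores.
response <- feature.matrix[[1]]
features <- feature.matrix[, -1, drop = FALSE]

table(response)  # number of regions per class
dim(features)    # regions x features
```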

Examples

## Not run: 
#  dirPath <- system.file("extdata", package="LedPred")
#  file.list <-   list.files(dirPath, full.names=TRUE)
#  background.freqs <- file.list[grep("freq", file.list)]
#  positive.regions <-  file.list[grep("positive", file.list)]
#  negative.regions <-  file.list[grep("negative", file.list)]
#  TF.matrices <-  file.list[grep("tf", file.list)]
#  ngs.path <- system.file("extdata/ngs", package="LedPred")
#  ngs.files=list.files(ngs.path, full.names=TRUE)
#  crm.features.list <- mapFeaturesToCRMs(positive.bed=positive.regions,
#      negative.bed=negative.regions,  background.freqs=background.freqs,
#      pssm=TF.matrices, genome="dm3", ngs=ngs.files,
#      crm.feature.file = "crm.features.tab",
#      stderr.log.file = "stderr.log", stdout.log.file = "stdout.log")
#  names(crm.features.list)
#  class(crm.features.list$crm.features)
#  crm.features.list$stdout.log
#  crm.features.list$stderr.log
# ## End(Not run)
