tileHMM (version 1.0-7)

generate.data: Generate Simulated Dataset

Description

Generate simulated data based on real data and the results of a previous analysis.

Usage

generate.data(data, group, pos.range = c(1, 10), num.seq = 100, gap = 35, split.gap = 1000, min.len = 2)

Arguments

data
A data.frame with information about genomic coordinates of probes (chromosome and position) in the first two columns. Subsequent columns contain probe measurements of individual samples.
group
Information that can be used to assign probes to one of two classes. Either a logical vector or the name of a GFF file. In the latter case all probes in annotated regions are considered to be ‘positive’.
pos.range
Indicates how many positive regions should be generated for each observation sequence. The actual number for each sequence is sampled uniformly from the indicated range of values.
num.seq
Number of observation sequences to generate.
gap
Gap between probes. Used to generate artificial probe coordinates.
split.gap
Gap between sequences.
min.len
Minimum number of probes per region.
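
The argument shapes described above can be sketched in base R. This is only an illustration under stated assumptions: the object names (probe.data, probe.group, region.counts), the column names, and the simulated values are hypothetical, and the last step shows one way to draw a count uniformly from pos.range, not necessarily how the package does it internally.

```r
set.seed(1)

n.probes  <- 200
n.samples <- 3

# 'data': chromosome and position in the first two columns,
# one column of probe measurements per sample after that.
probe.data <- data.frame(
  chromosome = rep("chr1", n.probes),
  position   = seq(from = 1, by = 35, length.out = n.probes),
  matrix(rnorm(n.probes * n.samples), ncol = n.samples,
         dimnames = list(NULL, paste0("sample", seq_len(n.samples)))),
  stringsAsFactors = FALSE
)

# 'group' as a logical vector: TRUE marks probes in 'positive' regions.
probe.group <- rep(FALSE, n.probes)
probe.group[c(41:60, 121:150)] <- TRUE

# Illustration of 'pos.range' and 'num.seq': one count of positive regions
# per observation sequence, drawn uniformly from the integer range.
pos.range <- c(1, 10)
num.seq   <- 100
region.counts <- pos.range[1] - 1 +
  sample.int(diff(pos.range) + 1, num.seq, replace = TRUE)
```

Here probe.group has one entry per row of probe.data, which is what a logical group vector needs in order to classify every probe.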

Value

A list with the following components:
observation
A data.frame with the same format as data.
regions
A list of state sequences.
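
A minimal sketch of calling the function and inspecting the two components of the result. The toy input (toy, is.pos) is hypothetical and may be too small or too simple for a useful simulation; the call is guarded so the code still runs when tileHMM is not installed, and any error raised by generate.data on this toy input is reported rather than stopping the script.

```r
set.seed(42)

# Hypothetical toy input: 300 probes, two samples, three enriched stretches.
n <- 300
toy <- data.frame(chromosome = "chr1",
                  position   = seq(1, by = 35, length.out = n),
                  s1 = rnorm(n), s2 = rnorm(n),
                  stringsAsFactors = FALSE)
is.pos <- seq_len(n) %in% c(50:80, 150:190, 240:260)
toy[is.pos, c("s1", "s2")] <- toy[is.pos, c("s1", "s2")] + 2

sim <- NULL
if (requireNamespace("tileHMM", quietly = TRUE)) {
  sim <- tryCatch(
    tileHMM::generate.data(toy, is.pos, pos.range = c(1, 5), num.seq = 10),
    error = function(e) {
      message("generate.data failed on the toy input: ", conditionMessage(e))
      NULL
    }
  )
}

if (!is.null(sim)) {
  str(sim$observation)   # data.frame in the same format as 'toy'
  length(sim$regions)    # number of state sequences returned
}
```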