fastcmh (version 0.2.7)

makefastcmhdata: Create sample data for fastcmh

Description

This function creates sample data for use with the runfastcmh method.

Usage

makefastcmhdata(folder = "./", xfilename = "data.txt", yfilename = "label.txt",
  covfilename = "cov.txt", K = 2, L = 1000, n = 200, noiseP = 0.3,
  corruptP = 0.05, rho = 0.8, tau1 = 100, taulength1 = 4, tau2 = 200,
  taulength2 = 4, seednum = 2, truetaufilename = "truetau.txt",
  showOutput = FALSE, saveToList = FALSE)

Arguments

folder
The folder in which the data will be saved. Default is current directory "./".
xfilename
The name of the data file. Default is "data.txt".
yfilename
The name of the label file. Default is "label.txt".
covfilename
The name of the file containing the covariate categories. This file contains just K numbers, where K is the number of covariates. Default is "cov.txt".
K
The number of covariates (a positive integer). Default is K=2.
L
The number of features (length of each sequence). Default is L=1000.
n
The number of samples (cases and controls combined). Default is n=200, i.e. 100 cases and 100 controls.
noiseP
The background noise in the data (as the probability of a 0/1 value being flipped). Default is noiseP=0.3.
corruptP
The probability of data corruption: each bit has probability corruptP of being flipped. Default is corruptP=0.05.
rho
The strength of the confounding in the confounded interval (as a probability). Default is rho=0.8 (i.e. a very strong signal).
tau1
The location of the significant interval (starting point). Default value is tau1=100.
taulength1
The length of the significant interval. Default value is taulength1=4, so the default significant interval is [100, 103].
tau2
The location of the confounded significant interval (starting point). Default value is tau2=200.
taulength2
The length of the confounded significant interval. Default value is taulength2=4, so the default confounded significant interval is [200, 203].
seednum
The seed used for generating the data. Default value is seednum=2.
truetaufilename
The file where the locations of the true significant intervals are saved (as opposed to the detected significant intervals). Default is "truetau.txt".
showOutput
Flag to decide whether or not to show output, such as where files are created and their names. Default is FALSE.
saveToList
Flag to decide whether to save the data to the folder or to return (output) the data as a list. Default is FALSE, so the data will be saved to the folder by default. However, all of the examples use saveToList=TRUE in order to avoid writing to file. When saveToList=TRUE, the list will consist of data, label and cov data frames.
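
The way a starting point and a length define an interval, and what a per-bit flip probability such as corruptP means, can be sketched in base R. This is only an illustration under stated assumptions, not the package's actual generator: interval_from_tau and flip_bits are hypothetical helper names that are not part of fastcmh.

```r
# Hypothetical helper (not part of fastcmh): the closed interval covered by
# a starting point `tau` and a length `taulength`.
interval_from_tau <- function(tau, taulength) {
  c(start = tau, end = tau + taulength - 1)
}

# Defaults from the Usage section: tau1 = 100, taulength1 = 4
sig_interval <- interval_from_tau(100, 4)   # covers positions 100 to 103
# Defaults for the confounded interval: tau2 = 200, taulength2 = 4
conf_interval <- interval_from_tau(200, 4)  # covers positions 200 to 203

# Hypothetical helper (not part of fastcmh): flip each 0/1 value
# independently with probability p.
flip_bits <- function(bits, p) {
  flip <- runif(length(bits)) < p
  bits[flip] <- 1L - bits[flip]
  bits
}

set.seed(2)
bits  <- rep(0L, 1000)
noisy <- flip_bits(bits, p = 0.05)

# The fraction of flipped bits should be roughly p for a long sequence
print(mean(noisy))
```

With p = 0.05 roughly one bit in twenty changes, so a sequence of all zeros ends up with a small fraction of ones.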

See Also

runfastcmh

Examples

#make a small sample data set, using the default parameters
mylist <- makefastcmhdata(showOutput=TRUE, saveToList=TRUE)

#make a very small sample data set
mylist <- makefastcmhdata(n=20, L=10, tau1=2, taulength1=2,
       tau2=6, taulength2=2, saveToList=TRUE)
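
Because saveToList=TRUE returns the data instead of writing files, the result can be inspected directly. The following is a minimal sketch that assumes the fastcmh package is installed. It does not assume any particular element names or dimensions and simply prints whatever the returned list contains.

```r
# Only run if fastcmh is available, so the sketch fails gracefully otherwise
if (requireNamespace("fastcmh", quietly = TRUE)) {
  mylist <- fastcmh::makefastcmhdata(n = 20, L = 10, tau1 = 2, taulength1 = 2,
                                     tau2 = 6, taulength2 = 2,
                                     saveToList = TRUE)

  # List the components that were returned
  print(names(mylist))

  # Show the type and size of each component without printing all values
  str(mylist, max.level = 1)
}
```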