iGC (version 1.2.2)

create_gene_exp: Create a joint gene expression table of all samples

Description

The function reads in all gene expression data listed in the sample description sample_desc and returns a joint expression table of all samples.

Usage

create_gene_exp(sample_desc, read_fun = NULL, progress = TRUE,
  progress_width = 48, ...)

Arguments

sample_desc
data.table object created by create_sample_desc.
read_fun
Custom reader function; see the Custom reader function section for details.
progress
Whether to display a progress bar. By default TRUE.
progress_width
The text width of the progress bar. Defaults to 48 characters.
...
Arguments passed to the custom reader function specified in read_fun.

Value

  • A data.table of gene expression for all samples, whose rows are genes and whose columns are samples. The first column, GENE, contains the corresponding gene names.
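
As a rough illustration of working with a table shaped like the one described above, the sketch below builds a toy stand-in and converts it to a numeric matrix with gene names as row names. The sample names S1 and S2, the gene names, and the values are made up for the example; they are not output of create_gene_exp.

```r
library(data.table)

# Toy stand-in for the returned table: first column GENE, then one
# column per sample (sample names here are hypothetical).
gene_exp_toy <- data.table(
    GENE = c("G1", "G2", "G3"),
    S1 = c(0.5, 1.2, -0.3),
    S2 = c(0.7, NA, 0.1)
)

# Keep only the sample columns and convert them to a numeric matrix.
sample_cols <- setdiff(names(gene_exp_toy), "GENE")
expr_mat <- as.matrix(gene_exp_toy[, sample_cols, with = FALSE])

# Carry the gene names over as row names.
rownames(expr_mat) <- gene_exp_toy$GENE
```

A matrix like expr_mat is often more convenient than a data.table when a downstream function expects genes as rows and samples as columns.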

Custom reader function

A custom reader function is supplied as read_fun = your_reader_fun. It takes the filepath as its first argument and returns a data.table whose first two columns are GENE (type character) and Expression (type double).

The first column of the joint gene expression table, GENE, stores the gene names, which are determined by the first sample evaluated.

Any remaining arguments of create_gene_exp(...) are passed on to this reader function.

Note: all string-like columns should NOT be of type factor. Remember to set stringsAsFactors = FALSE.
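
The reader contract above can be sketched for a non-TCGA format as follows. This is a minimal example under assumptions: the file is tab-separated with a header row, and the column names "symbol" and "value" as well as the function name read_tsv_gene_exp are hypothetical, chosen only for illustration.

```r
library(data.table)

# Hypothetical reader for a tab-separated file with a header row.
# gene_col and value_col name the columns to keep (assumed names).
read_tsv_gene_exp <- function(ge_filepath, gene_col = "symbol",
                              value_col = "value") {
    raw <- fread(ge_filepath, sep = "\t", header = TRUE)
    # Keep the two columns of interest, gene names first.
    dt <- raw[, c(gene_col, value_col), with = FALSE]
    setnames(dt, c("GENE", "Expression"))
    # Enforce the required column types: character and double.
    dt[, GENE := as.character(GENE)]
    dt[, Expression := as.numeric(Expression)]
    dt[]
}

# Intended use (not run here; sample_desc comes from create_sample_desc):
# gene_exp <- create_gene_exp(
#     sample_desc,
#     read_fun = read_tsv_gene_exp,
#     gene_col = "symbol"   # forwarded to the reader through ...
# )
```

Since fread does not create factor columns by default, no stringsAsFactors setting is needed in this sketch; a reader built on read.table should still set stringsAsFactors = FALSE or give explicit colClasses.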

Details

By default, the function assumes the data are in TCGA level 3 file format. However, most real-world data do not follow that format. In that case, tell the function how to parse the data by implementing a custom reader function that accepts the filepath as its first argument; see the Custom reader function section for the full specification. The function naively concatenates the returned expression values as columns of a new data.table, assuming that every sample lists its genes in the same order.
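
To see why the gene order matters for the concatenation described above, consider the toy sketch below. It is an illustration of the described behavior, not the package's actual implementation; the sample tables, the helper same_gene_order, and all values are made up for the example.

```r
library(data.table)

# Two toy reader outputs; sample B lists the same genes in another order.
sample_a <- data.table(GENE = c("G1", "G2", "G3"),
                       Expression = c(1, 2, 3))
sample_b <- data.table(GENE = c("G3", "G1", "G2"),
                       Expression = c(30, 10, 20))

# Naive column-wise concatenation: gene names come from the first
# sample only, so the values of sample B end up on the wrong rows.
naive <- data.table(GENE = sample_a$GENE,
                    A = sample_a$Expression,
                    B = sample_b$Expression)

# Hypothetical safeguard: TRUE only if both samples list genes identically.
same_gene_order <- function(x, y) identical(x$GENE, y$GENE)

# Reorder sample B to the gene order of sample A before concatenating.
sample_b_sorted <- sample_b[match(sample_a$GENE, GENE)]
aligned <- data.table(GENE = sample_a$GENE,
                      A = sample_a$Expression,
                      B = sample_b_sorted$Expression)
```

When input files may list genes in different orders, one option is to do this kind of reordering inside the custom reader, for example by sorting on GENE, so that every sample reaches create_gene_exp in the same order.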

See Also

read.table and fread for custom reader function implementation; create_sample_desc for creating sample description.

Examples

## Use first three samples of the builtin dataset

sample_root <- system.file("extdata", package = "iGC")
sample_desc_pth <- file.path(sample_root, "sample_desc.csv")
sample_desc <- create_sample_desc(
    sample_desc_pth, sample_root=sample_root
)[1:3]

## Define custom reader function for TCGA level 3 data
my_gene_exp_reader <- function(ge_filepath) {
    gene_exp <- read.table(
        ge_filepath,
        header = FALSE, skip = 2,
        na.strings = "null",
        colClasses = c("character", "double")
    )
    dt <- data.table::as.data.table(gene_exp)
    data.table::setnames(dt, c("GENE", "Expression"))
}
gene_exp <- create_gene_exp(
    sample_desc,
    read_fun = my_gene_exp_reader,
    progress_width = 60
)
gene_exp[1:5]