sagenhaft (version 1.42.0)

extract.lib: Functions for SAGE library extraction

Description

Functions to extract the tags in a library from sequences or base-caller output.

Usage

extract.lib.from.zip(zipfile, libname=sub(".zip","",basename(zipfile)), ...)
extract.lib.from.directory(dirname, libname=basename(dirname), pattern, ...)
extract.library.tags(filelist, base.caller.format="phd",
                     remove.duplicate.ditags=TRUE, remove.N=FALSE,
                     remove.low.quality=10, taglength=10,
                     min.ditag.length=(2*taglength-2),
                     max.ditag.length=(2*taglength+4),
                     cut.site="catg", default.quality=NA, verbose=TRUE, ...)
reestimate.lib.from.tagcounts(tagcounts, libname, default.quality=20, ...)
compute.unique.tags(lib)
combine.libs(..., artifacts=c("Linker", "Ribosomal", "Mitochondrial"))
remove.sage.artifacts(lib, artifacts=c("Linker","Ribosomal","Mitochondrial"), ...)
read.phd.file(file)
read.seq.qual.filepair(file, default.quality=NA)
extract.ditags(sequence, taglength=10, filename=NA,
               min.ditag.length=(2*taglength-2),
               max.ditag.length=(2*taglength+4), cut.site="catg")
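The defaults in these signatures are ordinary R expressions, so they can be evaluated in base R without loading the package. The following is a minimal sketch that reuses the file name from the Examples section and the two tag lengths mentioned under Arguments; it only evaluates the default expressions and does not call any sagenhaft function.

```r
# Base R only; sagenhaft itself is not needed for this check.
zipfile <- "E15postHFI.zip"

# Default library name: the base name of the file with ".zip" removed.
libname <- sub(".zip", "", basename(zipfile))

# Default ditag length bounds for tag lengths 10 and 17.
taglength <- c(10, 17)
min.ditag.length <- 2 * taglength - 2
max.ditag.length <- 2 * taglength + 4

print(libname)
print(rbind(taglength, min.ditag.length, max.ditag.length))
```

Because the first argument of sub is a regular expression, the dot in ".zip" matches any character. For unusual file names it may be safer to pass libname explicitly.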

Arguments

zipfile,dirname
Name of a ZIP file or a directory that contains base-caller output files
libname
Character string to be assigned as the library name
pattern
Regular expression to specify pattern for the files that will be read
filelist
List of files to be read
base.caller.format
Either "phd" or "seq", or a character vector with the same length as filelist
remove.duplicate.ditags
Remove duplicate ditags. TRUE or FALSE
remove.N
Remove all tags that contain N. TRUE or FALSE
remove.low.quality
Remove all tags with an average quality score of less than remove.low.quality. The filter is skipped if the value is < 0
taglength
Length of tags. Usually 10 or 17
min.ditag.length,max.ditag.length
Minimum and maximum length for ditags
cut.site
Restriction enzyme cut site. Usually CATG
verbose
Display information during process
lib
Library object
file,filename
Character string indicating file name
default.quality
Quality value to use on sequences, if quality files are missing
sequence
Construct containing sequence and quality values returned by read.phd.file or read.seq.qual.filepair
artifacts
Types of artificially generated tags to remove.
...
Arguments passed on to extraction functions.
tagcounts
Tag counts from a library. Integer vector with tag sequences as names.
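
The tagcounts argument is described as an integer vector named by tag sequence. A minimal base R sketch of building such a vector from a character vector of observed tags is shown here. The tag sequences are invented for illustration, and the commented-out call at the end assumes the package is installed; the library name "demo" is likewise hypothetical.

```r
# Hypothetical tag sequences, invented for illustration only.
tags <- c("AAACCCGGGT", "AAACCCGGGT", "TTTGGGCCCA",
          "AAACCCGGGT", "TTTGGGCCCA", "GATTACAGAT")

# table() counts each distinct tag; convert the result to a named
# integer vector, which is the shape described for tagcounts.
tab <- table(tags)
tagcounts <- setNames(as.integer(tab), names(tab))
print(tagcounts)

# With sagenhaft installed, the vector could then be passed on, e.g.:
# lib <- reestimate.lib.from.tagcounts(tagcounts, libname = "demo")
```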

Value

The extraction functions return a SAGE library object.

Details

Use extract.lib.from.zip or extract.lib.from.directory to extract the SAGE tags from the sequences of a library. The sequences must be provided as output files from the base-caller software, either in a ZIP archive or in a directory. These are usually the only functions a user needs to call directly. The other functions are called by these two and should be used directly only by experienced users who want finer control over the process. Most arguments are passed on and can therefore be specified in the high-level functions. ZIP file names must be specified using relative path names!
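
To give a feel for what the taglength, cut.site and ditag length arguments control, here is a conceptual base R sketch of ditag extraction. It is not the code used by extract.ditags: the helper names revcomp, sketch.ditags and tags.from.ditag are hypothetical, quality values are ignored, and it assumes that ditag length is measured between two cut sites, excluding the sites themselves. It follows the usual SAGE convention that the second tag is read as the reverse complement of the far end of the ditag.

```r
# Conceptual sketch only; NOT the implementation used by sagenhaft.

# Reverse complement of a lower-case DNA string.
revcomp <- function(x) {
  reversed <- paste(rev(strsplit(x, "")[[1]]), collapse = "")
  chartr("acgt", "tgca", reversed)
}

# Return the stretches between consecutive cut sites whose length
# lies within [min.len, max.len].
sketch.ditags <- function(sequence, taglength = 10, cut.site = "catg",
                          min.len = 2 * taglength - 2,
                          max.len = 2 * taglength + 4) {
  sequence <- tolower(sequence)
  pos <- gregexpr(cut.site, sequence, fixed = TRUE)[[1]]
  if (length(pos) < 2) return(character(0))  # need two flanking sites
  starts <- pos[-length(pos)] + nchar(cut.site)
  ends <- pos[-1] - 1
  candidates <- substring(sequence, starts, ends)
  len <- nchar(candidates)
  candidates[len >= min.len & len <= max.len]
}

# Split one ditag into its two tags.
tags.from.ditag <- function(ditag, taglength = 10) {
  n <- nchar(ditag)
  c(substr(ditag, 1, taglength),
    revcomp(substr(ditag, n - taglength + 1, n)))
}

# Made-up read: one 20-base ditag and one 5-base fragment that is too short.
read <- paste0("gg", "catg", "aaaaaaaaaaccccccccgt", "catg",
               "acgta", "catg", "tt")
ditags <- sketch.ditags(read)
tags <- tags.from.ditag(ditags[1])
print(ditags)
print(tags)
```

In the package itself these steps are driven by the high-level functions, so a sketch like this is only useful for understanding how the length bounds and cut site interact.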

References

http://tagcalling.mbgproject.org

See Also

sage.library, error.correction

Examples

#library(sagenhaft)
#file.copy(system.file("extdata", "E15postHFI.zip",package="sagenhaft"),
#          "E15postHFI.zip")
#E15post<-extract.lib.from.zip("E15postHFI.zip", taglength=10,
#                              min.ditag.length=20, max.ditag.length=24)
#E15post
