Rolexa (version 1.28.0)

FilterResults: FilterResults

Description

Filter basecalling results to keep only high-quality bases

Usage

FilterResults(run=Rolexa.env, results)
FilterResults(run, ...)

Arguments

run
a RolexaRun object defining the run parameters
results
a results object from SeqScore
...
additional arguments, ignored

Value

FilterResults returns an object suitable for SaveResults

Details

FilterResults filters the sequences according to the entropy thresholds set by IThresholds and applies the tag length cutoff MinimumTagLength.

The algorithm works as follows: for each tag, the base entropies are searched for a sub-vector k+1:l such that sum(entropy[n,5+k+1:l])<=IThresholds[l], where l=MinimumTagLength. Because : binds more tightly than + in R, k+1:l denotes positions k+1 through k+l. If such a sub-vector exists, it is extended in both directions until the total entropy exceeds the threshold: sum(results[n,5+k1:k2])>IThresholds[k2-k1+1].
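The search-and-extend step described above can be sketched in base R for a single tag. This is a minimal illustration, not Rolexa's implementation: find_window and its arguments are hypothetical names, the first qualifying seed is used, and the window is grown greedily (right, then left) for as long as the total stays within the threshold for the new length. The package may choose seeds and extension order differently.

```r
# Hypothetical helper, not part of Rolexa.
# entropy:    per-base entropies of one tag (numeric vector)
# thresholds: thresholds[len] = maximum total entropy allowed for length len
# min_len:    minimum tag length (plays the role of MinimumTagLength)
find_window <- function(entropy, thresholds, min_len) {
  n <- length(entropy)
  stopifnot(length(thresholds) >= n)
  if (n < min_len) return(NULL)

  # Seed search: first offset k whose window k + 1:min_len is within threshold.
  # Note that k + 1:min_len is k + (1:min_len) in R.
  seed <- NULL
  for (k in 0:(n - min_len)) {
    if (sum(entropy[k + 1:min_len]) <= thresholds[min_len]) {
      seed <- k
      break
    }
  }
  if (is.null(seed)) return(NULL)  # no acceptable sub-vector in this tag

  k1 <- seed + 1
  k2 <- seed + min_len

  # Greedy extension (an assumption of this sketch): accept a one-base step
  # only if the longer window still satisfies the threshold for its length.
  repeat {
    grown <- FALSE
    if (k2 < n && sum(entropy[k1:(k2 + 1)]) <= thresholds[k2 - k1 + 2]) {
      k2 <- k2 + 1
      grown <- TRUE
    }
    if (k1 > 1 && sum(entropy[(k1 - 1):k2]) <= thresholds[k2 - k1 + 2]) {
      k1 <- k1 - 1
      grown <- TRUE
    }
    if (!grown) break
  }
  c(k1 = k1, k2 = k2)
}

# Made-up entropies with noisy bases at positions 1 and 6.
entropy <- c(1.5, 0.1, 0.1, 0.2, 0.1, 1.8, 0.1)
thresholds <- 0.2 * seq_along(entropy)  # illustrative: 0.2 per base
find_window(entropy, thresholds, min_len = 3)
# k1 = 2, k2 = 5
```

The function returns NULL when no window of length min_len meets the threshold, which corresponds to a tag being rejected by the filter.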

The tag is then shortened to substr(results[n,5],k1,k2), but [ACGT] bases to the left of k1 and to the right of k2 are added back. If the Barcode parameter has been set, the first Barcode bases of each tag are always returned in a separate column. If PET=TRUE, the whole procedure is applied independently to each half of the sequence (two separate sets of tags and scores are returned), and the barcode, if any, is assumed to lie between the two paired tags.
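As a rough illustration of the trimming step, the sketch below cuts a tag to k1..k2 and then re-attaches flanking bases. It rests on one reading of the text: that only contiguous runs of unambiguous A, C, G or T bases directly adjacent to the window are added. trim_tag and split_halves are hypothetical helpers, not Rolexa functions, and split_halves ignores any barcode sitting between the paired tags.

```r
# Hypothetical helper, not part of Rolexa: trim one tag to positions k1..k2,
# then extend over adjacent unambiguous bases (assumed interpretation).
trim_tag <- function(tag, k1, k2) {
  bases <- strsplit(tag, "")[[1]]
  ok <- bases %in% c("A", "C", "G", "T")
  while (k1 > 1 && ok[k1 - 1]) k1 <- k1 - 1
  while (k2 < length(bases) && ok[k2 + 1]) k2 <- k2 + 1
  substr(tag, k1, k2)
}

# Hypothetical PET-style split: each half would be filtered on its own.
split_halves <- function(tag) {
  half <- nchar(tag) %/% 2
  c(substr(tag, 1, half), substr(tag, half + 1, nchar(tag)))
}

trim_tag("NACGTRYACGTN", 4, 9)
# "ACGTRYACGT"
split_halves("ACGTAC")
# "ACG" "TAC"
```

Ambiguity codes inside the window (here R and Y) are kept, since the window itself was accepted on total entropy; only the flanks are restricted to [ACGT].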

References

Probabilistic base calling of Solexa sequencing data, BMC Bioinformatics 2008, 9:431

See Also

readFastq to read fastq files, SeqScore and FilterResults to produce results for SaveResults