shazam (version 0.1.2)

summarizeBaseline: Calculate BASELINe summary statistics

Description

summarizeBaseline calculates BASELINe summary statistics: the selection strength (Sigma), its 95% confidence interval, and the P-value.

Usage

summarizeBaseline(baseline, returnType = c("baseline", "df"), nproc = 1)

Arguments

baseline
Baseline object returned by calcBaseline containing annotations and BASELINe posterior probability density functions (PDFs) for each sequence.
returnType
One of c("baseline", "df") defining whether to return a Baseline object ("baseline") with an updated stats slot or a data.frame ("df") of summary statistics.
nproc
Number of cores to distribute the operation over. If nproc = 0, a cluster is assumed to have been set up already and is not reset.
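
The Usage signature gives returnType a two-element vector as its default, which is the usual R idiom for a choice argument. The sketch below shows how that idiom typically behaves, assuming the value is resolved with match.arg (a common convention that this page does not confirm). pick_return_type is a hypothetical helper for illustration, not part of shazam.

```r
# Hypothetical helper illustrating the choice-argument idiom; not part of shazam.
pick_return_type <- function(returnType = c("baseline", "df")) {
    # match.arg() returns the first choice when the argument is left at its
    # default, and raises an error for values outside the allowed set.
    match.arg(returnType)
}

# Argument omitted: the first listed choice is selected.
default_choice <- pick_return_type()

# Argument given explicitly.
df_choice <- pick_return_type("df")

# A value outside the allowed set raises an error, caught here as NA.
bad_choice <- tryCatch(pick_return_type("list"),
                       error = function(e) NA_character_)
```

Under that assumption, omitting returnType would be equivalent to passing "baseline".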

Value

Either a modified Baseline object or a data.frame containing the BASELINe selection strength, 95% confidence intervals and P-value.
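
To make the three statistics concrete, here is a minimal base R sketch of how a mean, a 95% interval and a tail probability can be read off a posterior density discretized over a grid of sigma values. It is not shazam's implementation: the grid range, the Gaussian stand-in density and the one-sided tail definition are all assumptions chosen for illustration, and the P-value convention used by summarizeBaseline may differ.

```r
# Sketch only: summary statistics from a discretized posterior density.
# The grid range and the Gaussian stand-in are assumptions for illustration.
sigma_grid <- seq(-20, 20, length.out = 4001)

# Stand-in posterior centred at 1.5; a real PDF would come from calcBaseline.
pdf_raw <- dnorm(sigma_grid, mean = 1.5, sd = 0.5)
pdf <- pdf_raw / sum(pdf_raw)   # normalize so the grid weights sum to 1

# Selection strength: posterior mean of sigma.
sigma_mean <- sum(sigma_grid * pdf)

# 95% interval: grid points where the cumulative mass crosses 2.5% and 97.5%.
cdf <- cumsum(pdf)
ci_lower <- sigma_grid[which(cdf >= 0.025)[1]]
ci_upper <- sigma_grid[which(cdf >= 0.975)[1]]

# One-sided tail mass at or below zero: small values favour sigma > 0.
p_below_zero <- sum(pdf[sigma_grid <= 0])
```

A density concentrated above zero, as in this stand-in, gives a positive mean and an interval that excludes zero, which is the pattern usually read as evidence of positive selection.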

See Also

Other selection analysis functions: calcBaseline, groupBaseline, plotBaselineDensity, plotBaselineSummary

Examples

# Subset example data
db <- subset(InfluenzaDb, CPRIMER %in% c("IGHA","IGHM") & 
                          BARCODE %in% c("RL016","RL018","RL019","RL021"))

# Calculate BASELINe
# By default, calcBaseline collapses the sequences in db by the "CLONE" column,
# then calculates the numbers of observed mutations and the expected mutation
# frequencies for the regions defined in IMGT_V_NO_CDR3, using the HS5FModel
# targeting model. It then calculates the BASELINe posterior probability density
# functions (PDFs) for the sequences in the updated db, using the focused test
# statistic.
db_baseline <- calcBaseline(db, 
                            sequenceColumn="SEQUENCE_IMGT",
                            germlineColumn="GERMLINE_IMGT_D_MASK", 
                            testStatistic="focused",
                            regionDefinition=IMGT_V_NO_CDR3,
                            targetingModel = HS5FModel,
                            nproc = 1)

# Group the PDFs by the BARCODE and CPRIMER columns in the db, which correspond
# to the sample barcodes and the constant region isotype primers, respectively.
baseline_group <- groupBaseline(db_baseline, groupBy=c("BARCODE", "CPRIMER"))

# Get a data.frame of the summary statistics
baseline_stats <- summarizeBaseline(baseline_group, returnType="df")
                     
