
rKOMICS (version 1.3)

msc.length: Length of minicircles

Description

The msc.length function reports the length of every minicircle sequence in a single FASTA file. Use it to inspect the size distribution of the minicircles.
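To make "length of a minicircle sequence" concrete, here is a minimal base R sketch that counts the characters of each record in a FASTA file. The helper name fasta_lengths and the two toy records are assumptions made for illustration; this is not the rKOMICS implementation, only one way such lengths could be obtained.

```r
# Hypothetical helper, not part of rKOMICS: one length per FASTA record.
fasta_lengths <- function(path) {
  lines <- trimws(readLines(path, warn = FALSE))
  lines <- lines[nzchar(lines)]                 # drop empty lines
  is_header <- startsWith(lines, ">")           # header lines start with ">"
  # Assign every sequence line to the record opened by the preceding header.
  record <- factor(cumsum(is_header)[!is_header],
                   levels = seq_len(sum(is_header)))
  # Sum the characters of all sequence lines belonging to each record.
  lens <- tapply(nchar(lines[!is_header]), record, sum)
  lens[is.na(lens)] <- 0                        # records without sequence lines
  setNames(as.integer(lens), sub("^>", "", lines[is_header]))
}

# Toy FASTA file with two made-up records; the first spans two lines.
tmp <- tempfile(fileext = ".fasta")
writeLines(c(">mc1", "ACGTACGT", "ACG", ">mc2", "TTGA"), tmp)

lens <- fasta_lengths(tmp)
print(lens)
```

The result is a named integer vector with one element per record, which is the same shape of information that the length element described under Value carries.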

Usage

msc.length(file, samples, groups)

Value

length

a numeric vector with one element per minicircle sequence, giving the length of that sequence.

plot

a histogram of the frequency distribution of minicircle sequence lengths. As the example shows, the plot can be modified further with ggplot2 functions such as labs().
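As a rough illustration of what a frequency distribution of lengths contains, the sketch below bins a handful of made-up lengths with base R's hist(). The values and the 200-wide bins are assumptions chosen for the example; they are not rKOMICS output, and the package's own plot may bin differently.

```r
# Made-up sequence lengths, for illustration only.
toy_lengths <- c(650, 870, 880, 900, 905, 910, 1500)

# Bin the lengths without drawing; plot = FALSE returns the histogram object.
h <- hist(toy_lengths, breaks = seq(600, 1600, by = 200), plot = FALSE)

# Number of sequences falling into each bin.
print(data.frame(from = head(h$breaks, -1),
                 to = tail(h$breaks, -1),
                 count = h$counts))
```

Most toy sequences land in the 800 to 1000 bin, with one short and one long outlier. This is the kind of pattern the histogram is meant to reveal at a glance.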

Arguments

file

the name of the FASTA file that contains all the minicircle sequences, for example "all.minicircles.circ.fasta".

samples

a character vector containing the sample names.

groups

a vector of the same length as samples, giving the group (e.g., subspecies) to which each sample belongs.
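Because samples and groups are parallel vectors, element i of groups describes element i of samples. A minimal sketch of building and checking such a pair follows; the sample and group names are hypothetical and do not come from the package's example data.

```r
# Hypothetical sample names and their groups, in the same order.
samples <- c("sampleA", "sampleB", "sampleC", "sampleD")
groups  <- c("subspecies1", "subspecies1", "subspecies2", "subspecies2")

# Both vectors must have the same length so that they line up element-wise.
stopifnot(length(samples) == length(groups))

# Look up which samples belong to each group.
by_group <- split(samples, groups)
print(by_group)
```

Checking the lengths before calling the function catches a mismatched pair early.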

Examples

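library(rKOMICS)
library(ggplot2)   # labs()
library(ggpubr)    # ggarrange()
data(exData)       # example data with sample names and subspecies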

### run function
bf <- msc.length(file = system.file("extdata", "all.minicircles.fasta", package="rKOMICS"),
                 samples = exData$samples, groups = exData$subspecies)
af <- msc.length(file = system.file("extdata", "all.minicircles.circ.fasta", package="rKOMICS"),
                 samples = exData$samples, groups = exData$subspecies)

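### count sequences outside the 800-1400 range before filtering
length(which(bf$length < 800))    # shorter than 800
length(which(bf$length > 1400))   # longer than 1400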

### visualize results
hist(af$length, breaks=50)

### alter plot
ggarrange(bf$plot + labs(caption = "Before filtering"), 
          af$plot + labs(caption = "After filtering"), nrow=2)

