funbarRF (version 1.0.2)

Fungal Species Identification using DNA Barcode with Random Forest

Description

A machine-learning approach for identifying fungal species from DNA barcode sequence data. Barcode sequences are encoded as gap-pair compositional features, and the encoded dataset is used as input to a multi-class random forest model for prediction. Although the approach was developed for fungal species identification in particular, it can be used to identify other species as well.
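As a rough illustration of what a gap-pair compositional encoding can look like, the base-R sketch below counts, for each gap g, how often each ordered nucleotide pair occurs with g positions between its two members, and divides by the number of pairs available at that gap. The function name `gap_pair_composition`, the `max_gap` parameter, the gap indexing and the normalisation are assumptions made for this example; this is not the package's `encGPC` implementation, whose exact definition may differ.

```r
# Hypothetical helper (not part of funbarRF): gap-pair composition of one
# DNA sequence. For each gap g in 0..max_gap, returns the relative frequency
# of the 16 ordered pairs (X, Y) where Y lies g + 1 positions after X.
gap_pair_composition <- function(seq, max_gap = 2) {
  bases <- c("A", "C", "G", "T")
  pairs <- paste0(rep(bases, each = 4), rep(bases, times = 4))
  s <- strsplit(toupper(seq), "")[[1]]
  n <- length(s)

  unlist(lapply(0:max_gap, function(g) {
    m <- n - g - 1  # number of position pairs available at this gap
    if (m < 1) {
      # Sequence too short for this gap: report all-zero features.
      freq <- numeric(length(pairs))
    } else {
      observed <- paste0(s[1:m], s[(g + 2):n])
      # Pairs containing characters other than A/C/G/T (e.g. N) are not
      # counted, but they still contribute to the denominator m.
      counts <- table(factor(observed, levels = pairs))
      freq <- as.numeric(counts) / m
    }
    setNames(freq, paste0(pairs, "_gap", g))
  }))
}

# Toy sequence, gaps 0 and 1: 16 pairs x 2 gaps = 32 numeric features.
features <- gap_pair_composition("ACGTACGT", max_gap = 1)
length(features)
features[features > 0]
```

Applying such a function to every barcode sequence and row-binding the results would give a fixed-width numeric matrix, which is the kind of input a random forest classifier expects regardless of sequence length.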

Install

install.packages('funbarRF')

Monthly Downloads

3

Version

1.0.2

License

GPL (>= 2)

Maintainer

Prabina Meher

Last Published

May 27th, 2019

Functions in funbarRF (1.0.2)

Unite

UNITE training dataset of 143723 sequences belonging to 9001 species.

fun_dat

A dataset of 2726 fungal barcode sequences belonging to 1363 fungal species, where each species has exactly 2 sequences.

predict_test_funbarRF

Prediction of species labels for the query fungal barcode sequences.

seq_funbarRF_manual

Conversion of barcode sequences manually collected from the BOLD database into numeric features based on gap-pair compositions.

data_barcode

Barcode sequences for five different taxonomic entities, i.e., Fish, Bat, Inga, Drosophila and Cypraeidae.

encGPC

Encoding barcode sequences using gap-pair compositional features.

WarcupRDS

Warcup training dataset which is trained with funbarRF.

seq_funbarRF

Conversion of barcode sequences into numeric vectors based on gap-pair compositions, with user-supplied barcode sequences and species labels.

predict_train_funbarRF

Prediction of species labels for the out-of-bag (OOB) reference barcode sequences using Random Forest.

read_seq_txt

Conversion of DNA sequences of character type to DNAStringSet type.