variantspark (version 0.1.1)

vs_importance_analysis: Importance Analysis

Description

Performs an importance analysis of the variants in a VCF source against a set of labels, using a random forest algorithm. For more details, see the VariantSpark documentation.

Usage

vs_importance_analysis(vsc, vcf_source, labels, n_trees)

Arguments

vsc

A variantspark connection.

vcf_source

An object of class VCFFeatureSource, usually the output of vs_read_vcf().

labels

An object of class CsvLabelSource, usually the output of vs_read_labels().

n_trees

The number of trees to use in the random forest.

Value

A reference to a Spark Java object, with classes spark_jobj and shell_jobj.

Examples

# NOT RUN {
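library(sparklyr)
library(variantspark)

# Connect to a local Spark instance and attach variantspark to it
sc <- spark_connect(master = "local")
vsc <- vs_connect(sc)

# Read the example VCF file shipped with the package
hipster_vcf <- vs_read_vcf(vsc,
                           system.file("extdata/hipster.vcf.bz2",
                                       package = "variantspark"))

# Read the matching labels
labels <- vs_read_labels(vsc,
                         system.file("extdata/hipster_labels.txt",
                                     package = "variantspark"))

# Run the importance analysis with 10 trees
vs_importance_analysis(vsc, hipster_vcf, labels, 10)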
# }
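
The steps in the example can also be gathered into one function. The following is only a sketch under stated assumptions: run_importance_analysis() is a hypothetical helper that is not part of variantspark, its default of 100 trees is an arbitrary illustrative choice, and it expects the caller to supply an open sparklyr connection and the two file paths. It uses only the functions documented on this page.

```r
# Hypothetical helper, not part of variantspark: runs the documented steps
# (connect, read VCF, read labels, analyse) on an existing sparklyr connection.
run_importance_analysis <- function(sc, vcf_path, labels_path, n_trees = 100) {
  # n_trees must be a single positive whole number (100 is an arbitrary default)
  stopifnot(
    is.numeric(n_trees),
    length(n_trees) == 1,
    n_trees >= 1,
    n_trees == round(n_trees)
  )

  vsc    <- variantspark::vs_connect(sc)
  vcf    <- variantspark::vs_read_vcf(vsc, vcf_path)
  labels <- variantspark::vs_read_labels(vsc, labels_path)

  # Returns whatever vs_importance_analysis() returns (see Value)
  variantspark::vs_importance_analysis(vsc, vcf, labels, n_trees)
}
```

The caller keeps ownership of the Spark connection, so the returned object stays usable until the caller closes that connection, for example with sparklyr's spark_disconnect(sc).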