Numero (version 1.2.0)

nroTrain: Train self-organizing map

Description

Iterative algorithm to adapt a self-organizing map (SOM) to a set of multivariable data.

Usage

nroTrain(som, data, subsample = NULL, metric = "euclid")

Arguments

som

A list object as returned by nroKohonen().

data

A matrix or a data frame.

subsample

Number of rows used during a single training cycle.

metric

Distance metric in data space, either "euclid" or "pearson".
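
The two metric names can be pictured with a minimal base R sketch. This page does not say how Numero computes either distance internally, so the use of one minus the Pearson correlation for "pearson" is an assumption made only for illustration, and the vectors x and y are hypothetical.

```r
# Two hypothetical data-space vectors (e.g. a data row and a centroid).
x <- c(1, 2, 3, 4)
y <- c(2, 4, 6, 9)

# Euclidean distance: square root of the summed squared differences.
d_euclid <- sqrt(sum((x - y)^2))

# Correlation-based distance (assumed form): one minus Pearson's r,
# which is 0 for perfectly correlated profiles and 2 for opposite ones.
d_pearson <- 1 - cor(x, y, method = "pearson")

print(c(euclid = d_euclid, pearson = d_pearson))
```

The Euclidean distance is sensitive to differences in magnitude, whereas a correlation-based distance compares only the shape of the two profiles.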

Value

A copy of the list object som in which the element centroids has been updated according to the data patterns. The quantization errors recorded during training are stored in the element history, and the element metric is set to the distance measure used.

Details

The model is fitted using only the columns that are present in both the SOM centroids and the input data.
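
The column matching can be sketched in base R as follows. The objects centroids and data are hypothetical stand-ins for som$centroids and the input data, and the use of intersect() is an assumption about how such matching might be done, not a description of the Numero internals.

```r
# Hypothetical stand-ins for som$centroids and the input data.
centroids <- matrix(0, nrow = 3, ncol = 3,
                    dimnames = list(NULL, c("CHOL", "TG", "CREAT")))
data <- data.frame(CHOL = c(4.8, 5.3), TG = c(1.1, 2.0), AGE = c(41, 57))

# Keep only the columns that appear in both objects.
shared <- intersect(colnames(centroids), colnames(data))
fitdata <- as.matrix(data[, shared, drop = FALSE])

print(shared)
```

In this sketch CREAT and AGE are dropped because each appears in only one of the two objects.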

If subsample is less than the number of data rows, a random subset of the specified size is used for each training cycle.
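
The subsampling rule can be illustrated with a short base R loop. This is a sketch under stated assumptions: the number of cycles (ncycles) is hypothetical, rows are drawn without replacement, and the centroid update itself is omitted; none of these details is confirmed by this page.

```r
set.seed(1)  # only to make the sketch reproducible

# Hypothetical training data: 500 rows, 3 variables.
data <- matrix(rnorm(500 * 3), nrow = 500, ncol = 3)
subsample <- 200  # rows used per training cycle
ncycles <- 3      # hypothetical number of cycles

for (cycle in seq_len(ncycles)) {
  if (subsample < nrow(data)) {
    # Draw a fresh random subset for this cycle (assumed: no replacement).
    rows <- sample.int(nrow(data), size = subsample)
  } else {
    # Otherwise every row takes part in the cycle.
    rows <- seq_len(nrow(data))
  }
  batch <- data[rows, , drop = FALSE]
  # A real training cycle would update the centroids from `batch` here.
  cat("cycle", cycle, "uses", nrow(batch), "rows\n")
}
```

Because a new subset is drawn in every cycle, different rows contribute to different cycles even though each cycle sees only part of the data.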

References

Gao S, Mutter S, Casey AE, Mäkinen V-P (2018) Numero: a statistical framework to define multivariable subgroups in complex population-based datasets, Int J Epidemiology, https://doi.org/10.1093/ije/dyy113

Examples

# NOT RUN {
# Import data.
fname <- system.file("extdata", "finndiane.txt", package = "Numero")
dataset <- read.delim(file = fname)

# Prepare training data.
trvars <- c("CHOL", "HDL2C", "TG", "CREAT", "uALB")
trdata <- scale.default(dataset[, trvars])

# K-means clustering.
km <- nroKmeans(data = trdata)

# Train with full data.
sm <- nroKohonen(seeds = km)
sm <- nroTrain(som = sm, data = trdata)
print(sm$history)

# Train with subsampling.
sm <- nroKohonen(seeds = km)
sm <- nroTrain(som = sm, data = trdata, subsample = 200)
print(sm$history)
# }
