SpaCCr (version 0.1.0)

SpaCC_Missing: Solve Spatial Convex Clustering problem for missing data

Description

Solve Spatial Convex Clustering problem for missing data

Usage

SpaCC_Missing(X, w, gamma, nu = 1/nrow(X), verbose = FALSE, tol.base = 1e-04, tol.miss = 1e-04, max.iter.base = 5000, max.iter.miss = 500, Uinit, Vinit, Laminit)

Arguments

X
A subject (n) by variable (p) matrix; the data
w
A vector of length p-1; weights for clustering
gamma
A positive scalar; regularization parameter
nu
A positive scalar; augmented Lagrangian parameter
verbose
Logical; should messages be printed?
tol.base
A small positive scalar; convergence tolerance for base SpaCC problem.
tol.miss
A small positive scalar; convergence tolerance for missing data problem.
max.iter.base
A positive integer; maximum number of iterations for base SpaCC problem
max.iter.miss
A positive integer; maximum number of iterations for missing data problem
Uinit
An n by p matrix; initial value for U
Vinit
An n by p-1 matrix; initial value for V
Laminit
An n by p-1 matrix; initial value for Lam
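
The dimension requirements above are easy to get wrong, so here is a minimal sketch that builds correctly shaped inputs from toy data using base R only. The toy matrix, the positions in `coords`, the kernel scale of 5000, and the 20000 cutoff are illustrative assumptions (the last two mirror the example further down), and mean imputation is just one possible way to form a starting value. `SpaCC_Missing` itself is not called here.

```r
# Toy data: n = 4 subjects, p = 6 ordered variables, with two missing entries.
set.seed(1)
n <- 4
p <- 6
X <- matrix(rnorm(n * p), nrow = n, ncol = p)
X[2, 3] <- NA
X[4, 5] <- NA

# Hypothetical positions of the p variables (illustrative values only).
coords <- c(100, 400, 900, 30000, 30500, 31000)

# One weight per adjacent pair of variables, hence length p - 1.
gaps <- diff(coords)
w <- exp(-gaps / 5000)
w[gaps > 20000] <- 0   # zero weight between very distant neighbours

# Starting values: fill missing entries with the overall mean (an assumption),
# then take differences between adjacent columns.
Uinit <- X
Uinit[is.na(Uinit)] <- mean(X, na.rm = TRUE)
Vinit <- t(diff(t(Uinit)))   # n by (p - 1)
Laminit <- Vinit             # same shape as Vinit

# Check the shapes against the argument descriptions.
stopifnot(length(w) == p - 1,
          all(dim(Uinit) == c(n, p)),
          all(dim(Vinit) == c(n, p - 1)),
          all(dim(Laminit) == c(n, p - 1)))
```

`diff()` works along rows of a matrix, which is why the matrix is transposed before and after the call to obtain column-wise differences.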

Value

A list with elements U, V, and Lam
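
As a rough sketch of how the returned list might be inspected, the code below reads cluster boundaries off `U` by checking where adjacent columns coincide. This page does not define what `U`, `V`, and `Lam` mean, so the interpretation is an assumption: it follows the usual convex clustering convention in which fused variables share identical fitted values. The `Sol` object is a hand-made stand-in with made-up numbers, not real output from `SpaCC_Missing`.

```r
# Stand-in for the list returned by SpaCC_Missing (values are made up).
U <- rbind(c(0.2, 0.2, 0.2, 0.9, 0.9),
           c(0.1, 0.1, 0.1, 0.7, 0.7))
Sol <- list(U = U, V = t(diff(t(U))), Lam = matrix(0, 2, 4))

# Assumption: two adjacent variables are fused when their difference in U
# is numerically zero for every subject.
col.diffs <- abs(t(diff(t(Sol$U))))
fused <- apply(col.diffs, 2, max) < 1e-8

# The cluster label increases by one at every non-fused boundary.
cluster <- cumsum(c(1, !fused))
cluster
```

The tolerance `1e-8` is arbitrary; with real solver output a threshold related to `tol.base` may be more appropriate.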

Examples

library(SpaCCr)
library(dplyr)
library(tidyr)

# Load the example methylation data and keep a small subset
data("methy")
methy <- methy[1:20, 1:10]
Coordinates <- methy$Genomic_Coordinate

# Reshape to a subject-by-probe matrix
X <- methy %>%
  as_tibble() %>%
  select(-Chromosome, -Genomic_Coordinate) %>%
  gather(Subject, Value, -ProbeID) %>%
  spread(ProbeID, Value)
SubjectLabels <- X$Subject
X <- X[, -1] %>% as.matrix()
X[1:5, 1:5]
nsubj <- nrow(X)
nprobes <- ncol(X)

# Weights between adjacent probes: decay with genomic distance,
# set to zero when neighbours are more than 20000 apart
diff.vals <- diff(Coordinates)
too.far <- diff.vals > 20000
sig <- 1/5e3
w.values <- exp(-sig * diff.vals)
w.values[too.far] <- 0

# Solver settings
verbose <- TRUE
tol.base <- 1e-4
tol.miss <- 1e-4
max.iter.base <- 5000
max.iter.miss <- 500

# Center each subject (row), then fill any missing entries with the
# overall mean to obtain a starting value for U
Xc <- t(scale(t(X), center = TRUE, scale = FALSE))
bo <- Xc
bo[is.na(bo)] <- mean(bo, na.rm = TRUE)

best.gam <- 1
Sol <- SpaCC_Missing(Xc,
                     w.values,
                     gamma = best.gam,
                     nu = 1/nsubj,
                     verbose = verbose,
                     tol.base = tol.base,
                     tol.miss = tol.miss,
                     max.iter.base = max.iter.base,
                     max.iter.miss = max.iter.miss,
                     Uinit = bo,
                     Vinit = t(diff(t(bo))),
                     Laminit = t(diff(t(bo))))