island (version 0.2.4)

upgma_model_selection: Model selection function based on a upgma grouping algorithm

Description

upgma_model_selection function conducts a model selection procedure intended to find an optimal partition that mimimize AIC values. Maximum likelihood estimation of model parameters (Colonization, Extinction) or (Colonization, Extinction, Detectability, P_0) is performed assuming either perfect detectability or imperfect detectability, respectively. In the latter case, the input data frame should contain multiple transects per sampling time. This function can handle missing data defining a heterogeneous sampling structure across the rows of the input data matrix. The function generates, as an output, a Sx6 matrix with the following 6 columns (for the S diffirent partitions)): (No of Model Parameters, NLL, AIC, AIC_c, AIC_d, AIC_w) which compares all upgma-generarated partitions.

Usage

upgma_model_selection(Data, Time, Factor, Tags, Colonization = 1,
  Extinction = 1, Detectability_Value = 0.5, Phi_Time_0_Value = 0.5,
  Tol = 1e-08, MIT = 100, C_MAX = 10, C_min = 0, E_MAX = 10,
  E_min = 0, D_MAX = 0.99, D_min = 0, P_MAX = 0.99, P_min = 0.01,
  I_0 = 0, I_1 = 1, I_2 = 2, I_3 = 3, z = 2, Verbose = 0,
  MV_FLAG = 0.1, PerfectDetectability = TRUE)

Arguments

Data

data frame containing presence data per time (in cols) and sites (in rows)

Time

an array of length n containing sampling times

Factor

column number containing the 'data frame' factor used to split total data into level-based groups

Tags

array of factor level names: name[i] is the level tag (short name) for the i-th level.

Colonization

guess value to initiate search / parameter value

Extinction

guess value to initiate search / parameter value

Detectability_Value

guess value to initiate search / parameter value

Phi_Time_0_Value

guess value to initiate search / parameter value

Tol

stopping criteria of the search algorithm.

MIT

max number of iterations of the search algorithm.

C_MAX

max value for the possible range of colonization values

C_min

min value for the possible range of colonization values

E_MAX

max value for the possible range of colonization values

E_min

min value for the possible range of colonization values

D_MAX

max value for the possible range of colonization values

D_min

min value for the possible range of colonization values

P_MAX

max value for the possible range of colonization values

P_min

min value for the possible range of colonization values

I_0

has to be 0, 1, 2, or 3. Defaults to 0

I_1

has to be 0, 1, 2, or 3. Defaults to 1

I_2

has to be 0, 1, 2, or 3. Defaults to 2

I_3

has to be 0, 1, 2, or 3. Defaults to 3

z

dimension of the parameter subspace for which the optimization process will take place. Default is 2

Verbose

more/less (1/0) information about the optimization algorithm will be printed out.

MV_FLAG

missing value flag (to specify sites and times where no sample exists)

PerfectDetectability

TRUE means 'Perfect Detectability'. Of course, FALSE means 'Imperfect Detectability'

Details

The output matrix contains a row for the S different binary partitions of the set of S groups. Searches are conducted using Nelder-Mead simplex method in a bounded parameter space which means that in case a neg loglikelihood (NLL) evaluation is called out from these boundaries, the returned value for this NLL evaluation is artifically given as the maximum number the machine can hold. The input is a data frame containing presence data per time (in cols) and sites (in rows). Different factors (for instance, OTU, location, etc) can slide the initial data frame in their different levels, accordingly. Each initial group (usually, species, OUTs, factors, ...) is named by a short-length-character label (ideally, 3 or 4 characters). The length of Tags array should match the number of levels in which the given factor is subdivided. All labels should have the same character length to fulfill memmory alignment requriement of the shared object called by .C(...) function. I_0, I_1, I_2, and I_3 are model parameter keys. They are used to define a 4D-vector (Index). The model parameter keys correspond to the colonization (0), extinction (1), detectability (2), and Phi_0 (3) model parameters in case detectability is imperfect or, alternatively, only colonization (0) and extinction (1) in case detectability is perfect. For instance, if (I_0, I_1) is (1, 0), searches will take place within the paremeter space defined by extinction, as the first axis, and colonization, as the second.

Examples

Run this code
# NOT RUN {
Data <- lakshadweepPLUS[[1]]
Guild_Tag = c("Alg", "Cor", "Mac", "Mic", "Omn", "Pis", "Zoo")
Time <- as.vector(c(2000, 2000, 2001, 2001, 2001, 2001, 2002, 2002, 2002, 
2002, 2003, 2003, 2003, 2003, 2010, 2010, 2011, 2011, 2011, 2011, 2012, 
2012, 2012, 2012, 2013, 2013, 2013, 2013))
R <- upgma_model_selection(Data, Time, Factor = 3, Tags = Guild_Tag, 
PerfectDetectability = FALSE, z = 4)
Guild_Tag = c("Agt", "Kad", "Kvt")
R <- upgma_model_selection(Data, Time, Factor = 2, Tags = Guild_Tag, 
PerfectDetectability = FALSE, z = 4)
# }

Run the code above in your browser using DataLab