pls.pathmox: PATHMOX-PLS: Extended Segmentation Trees in Partial Least Squares Structural Equation Modeling (PLS-SEM)

Description

The function pls.pathmox calculates a binary segmentation tree in the context of PLS-SEM following the PATHMOX algorithm. The procedure can be summarized as follows. It starts with the estimation of the global PLS-SEM model at the root node. Then, using the segmentation variables, all possible binary splits of the data are produced, and a local model is estimated for each partition. Among all the splits, the best one is selected by means of an F-test comparing the inner models. This process is applied recursively to each child node. The stopping criterion is based on the significance level of the p-value associated with the F statistic. Two additional stopping parameters are also considered: the number of individuals in a node and the depth of the tree. This function extends the pathmox algorithm introduced by Sanchez in 2009 with two new tests: the F-block test (to detect which endogenous latent equations are responsible for the difference) and the F-coefficient test (to detect which path coefficients are responsible for the difference). The F-tests used in the split process are implemented following classic least squares estimation. An implementation of the tests based on LAD regression is also provided to overcome the parametric assumptions of the F-test.
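The idea of comparing models across a binary split with an F-test can be illustrated on a single regression equation. The following is a minimal sketch in base R, not the code used inside pls.pathmox: the simulated data, the grouping variable g, and the 0.05 threshold are assumptions made for illustration only. It fits one pooled model and one model with segment-specific coefficients, then compares them in the manner of a Chow test.

```r
# Illustrative only: a Chow-type F-test on one regression equation.
# This is NOT the genpathmox implementation; all names are hypothetical.
set.seed(1)
n <- 200
g <- factor(rep(c("a", "b"), each = n / 2))   # hypothetical binary split
x <- rnorm(n)
slope <- ifelse(g == "a", 0.2, 0.9)           # the slope differs by segment
y <- slope * x + rnorm(n, sd = 0.5)

pooled   <- lm(y ~ x)       # same coefficients in both segments
separate <- lm(y ~ x * g)   # segment-specific intercept and slope

# F-test of H0: both segments share the same coefficients
cmp     <- anova(pooled, separate)
f_stat  <- cmp$F[2]
p_value <- cmp$`Pr(>F)`[2]

# A split would be a candidate only if the difference is significant
split_is_significant <- p_value < 0.05
```

In the PATHMOX setting the comparison is made on the inner (structural) model rather than on a single simulated equation, so this sketch only conveys the logic of the test.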

Usage

pls.pathmox(
  x,
  inner,
  outer,
  mode,
  scheme = "path",
  scaling = NULL,
  scaled = TRUE,
  SVAR,
  signif,
  deep,
  method = "lm",
  size,
  X = NULL,
  n.node = 30,
  ...
)

Arguments

x

Matrix or data frame containing the manifest variables.

inner

A square (lower triangular) boolean matrix representing the inner model (i.e. the path relationships between latent variables).

outer

list of vectors with column indices or column names from x indicating the sets of manifest variables forming each block (i.e. which manifest variables correspond to each block).

mode

character vector indicating the type of measurement for each block. Possible values are: "A", "B", "newA", "PLScore", "PLScow". The length of mode must be equal to the length of outer.

scheme

string indicating the type of inner weighting scheme. Possible values are "centroid", "factorial", or "path".

scaling

optional argument for running the non-metric approach; it is a list of string vectors indicating the type of measurement scale for each manifest variable specified in outer. scaling must be specified when working with non-metric variables. Possible values: "num" (linear transformation, suitable for numerical variables), "raw" (no transformation), "nom" (non-monotonic transformation, suitable for nominal variables), and "ord" (monotonic transformation, suitable for ordinal variables).

scaled

whether manifest variables should be standardized. Only used when scaling = NULL. By default (TRUE), data are scaled to standardized values (mean = 0 and variance = 1).

SVAR

A data frame of factors containing the segmentation variables.

signif

A numeric value indicating the significance threshold of the F-statistic. Must be a decimal number between 0 and 1.

deep

An integer indicating the depth level of the tree. Must be an integer greater than 1.

method

A string indicating the criterion used to calculate the tests; it can be equal to "lm" or "lad".

size

A numeric value indicating the minimum size of elements inside a node.

X

Optional dataset (matrix or data frame) used when argument dataset=NULL inside pls.

n.node

It is the minimum number of individuals to consider a candidate partition (30 by default).

…

Further arguments passed on to pls.pathmox.

Value

An object of class "xtree.pls". Basically a list with the following results:

MOX

Data frame with the results of the segmentation tree

root

List of elements contained in the root node

terminal

List of elements contained in the terminal nodes

nodes

List of elements contained in all nodes: terminal and intermediate

candidates

List of data frames containing the candidate splits of each node partition

Fg.r

Data frame containing the results of the F-global test for each node partition

Fb.r

List of data frames containing the results of the F-block test for each node partition

Fc.r

A list of data frames containing the results of the F-coefficients test for each node partition

model

Information about the internal parameters

hybrid

A hybrid categorical factor defined according to the final segments identified by pathmox

Details

The argument x must be a data frame containing the manifest variables of the PLS-SEM model.

The argument inner is a matrix of zeros and ones that indicates the structural relationships between latent variables. inner must be a lower triangular matrix; it contains a 1 when column j affects row i, 0 otherwise.
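As a small illustration of this convention, the sketch below builds an inner matrix for three latent variables and checks that it is binary and lower triangular. The latent variable names and the paths are hypothetical and are not taken from the package.

```r
# Hypothetical three-construct model: Image -> Quality -> Satisfaction,
# plus a direct path Image -> Satisfaction.
lvs   <- c("Image", "Quality", "Satisfaction")
inner <- matrix(0, nrow = 3, ncol = 3, dimnames = list(lvs, lvs))

# Row i receives a 1 in column j when column j affects row i
inner["Quality", "Image"]        <- 1
inner["Satisfaction", "Image"]   <- 1
inner["Satisfaction", "Quality"] <- 1

# Checks: only zeros and ones, and nothing on or above the diagonal
is_binary           <- all(inner %in% c(0, 1))
is_lower_triangular <- all(inner[upper.tri(inner, diag = TRUE)] == 0)
```

Ordering the latent variables so that every cause precedes its effects is what keeps the matrix lower triangular.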

The argument SVAR must be a data frame containing segmentation variables as factors. The number of rows in SVAR must be the same as the number of rows in the data used in x.
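These two requirements can be checked before calling the function. The sketch below uses small made-up data frames (x_demo and svar_demo are hypothetical names, not objects from the package) to show one way of doing so in base R.

```r
# Hypothetical manifest variables and segmentation variables (6 individuals)
set.seed(123)
x_demo <- data.frame(mv1 = rnorm(6), mv2 = rnorm(6))
svar_demo <- data.frame(
  Gender = factor(c("f", "m", "f", "m", "f", "m")),
  Salary = factor(c("low", "mid", "high", "low", "mid", "high"),
                  levels = c("low", "mid", "high"), ordered = TRUE)
)

# Every segmentation variable must be a factor (ordered or not)
all_factors <- all(vapply(svar_demo, is.factor, logical(1)))

# One row of segmentation values per row of manifest data
same_rows <- nrow(svar_demo) == nrow(x_demo)
```

Declaring ordinal variables as ordered factors, as done for Salary here, mirrors what the examples in this page do for the fibtele segmentation variables.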

The argument signif represents the p-value threshold taken as the reference to stop partitioning the tree.

The argument deep represents the depth level of the tree taken as the reference to stop partitioning the tree.

The argument method is a string containing the criterion used to calculate the tests; if method="lm" the classic least squares approach is used to perform the tests; if method="lad" the LAD (least absolute deviation) regression is used.

The argument size is defined as a decimal value (i.e. proportion of elements inside a node).

The argument n.node is the minimum number of individuals to consider a candidate partition. If the candidate split produces a partition where the number of individuals is less than n.node, the partition is not considered.
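The effect of such a threshold can be sketched with an ordered factor, assuming (for illustration only) that its candidate binary splits are the cut points between consecutive levels. The variable names and the data below are hypothetical, and this is not the package's internal code.

```r
# Hypothetical ordered segmentation variable with 100 individuals
seg <- factor(rep(c("low", "mid", "high"), times = c(55, 30, 15)),
              levels = c("low", "mid", "high"), ordered = TRUE)
n_node <- 30   # plays the role of n.node in this sketch

# Candidate binary splits: levels up to cut i versus the remaining levels
lv   <- levels(seg)
cuts <- seq_len(length(lv) - 1)
sizes <- t(sapply(cuts, function(i) {
  left <- seg %in% lv[seq_len(i)]
  c(left = sum(left), right = sum(!left))
}))

# Keep a candidate only if both child nodes reach the minimum size
keep <- sizes[, "left"] >= n_node & sizes[, "right"] >= n_node
# Split 1 (low | mid, high) gives 55 and 45 individuals, so it is kept;
# split 2 (low, mid | high) gives 85 and 15, so it is discarded.
```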

References

Lamberti, G. et al. (2017) The Pathmox approach for PLS path modeling: Discovering which constructs differentiate segments. Applied Stochastic Models in Business and Industry; doi: 10.1002/asmb.2270

Lamberti, G. et al. (2016) The Pathmox approach for PLS path modeling segmentation. Applied Stochastic Models in Business and Industry; doi: 10.1002/asmb.2168

Lamberti, G. (2014) Modeling with Heterogeneity. PhD Dissertation.

Examples

# NOT RUN {
## Example of PLS-PM on alumni satisfaction (fibtele data)
library(genpathmox)
data(fibtele)

# Select the manifest variables
data.fib <- fibtele[, 12:35]

# Define the inner model matrix (a 1 means the column affects the row)
Image     <- rep(0, 5)
Qual.spec <- rep(0, 5)
Qual.gen  <- rep(0, 5)
Value     <- c(1, 1, 1, 0, 0)
Satis     <- c(1, 1, 1, 1, 0)
inner.fib <- rbind(Image, Qual.spec, Qual.gen, Value, Satis)
colnames(inner.fib) <- rownames(inner.fib)

# Blocks of indicators (outer model), all measured in mode "A"
outer.fib <- list(1:8, 9:11, 12:16, 17:20, 21:24)
modes.fib <- rep("A", 5)

# Segmentation variables; recode the ordinal ones as ordered factors
seg.fib <- fibtele[, 2:11]
seg.fib$Age <- factor(seg.fib$Age, ordered = TRUE)
seg.fib$Salary <- factor(seg.fib$Salary,
  levels = c("<18k", "25k", "35k", "45k", ">45k"), ordered = TRUE)
seg.fib$Accgrade <- factor(seg.fib$Accgrade,
  levels = c("accnote<7", "7-8accnote", "accnote>8"), ordered = TRUE)
seg.fib$Grade <- factor(seg.fib$Grade,
  levels = c("<6.5note", "6.5-7note", "7-7.5note", ">7.5note"), ordered = TRUE)

# Pathmox analysis
fib.pathmox <- pls.pathmox(data.fib, inner.fib, outer.fib, modes.fib,
                           SVAR = seg.fib, signif = 0.05,
                           deep = 2, size = 0.2, n.node = 20)

# }
# NOT RUN {
## Same analysis on the first 50 individuals, with two segmentation variables
library(genpathmox)
data(fibtele)

# Select the manifest variables
data.fib <- fibtele[1:50, 12:35]

# Define the inner model matrix (a 1 means the column affects the row)
Image     <- rep(0, 5)
Qual.spec <- rep(0, 5)
Qual.gen  <- rep(0, 5)
Value     <- c(1, 1, 1, 0, 0)
Satis     <- c(1, 1, 1, 1, 0)
inner.fib <- rbind(Image, Qual.spec, Qual.gen, Value, Satis)
colnames(inner.fib) <- rownames(inner.fib)

# Blocks of indicators (outer model), all measured in mode "A"
outer.fib <- list(1:8, 9:11, 12:16, 17:20, 21:24)
modes.fib <- rep("A", 5)

# Segmentation variables; recode Salary as an ordered factor
seg.fib <- fibtele[1:50, c(2, 7)]
seg.fib$Salary <- factor(seg.fib$Salary,
  levels = c("<18k", "25k", "35k", "45k", ">45k"), ordered = TRUE)

# Pathmox analysis
fib.pathmox <- pls.pathmox(data.fib, inner.fib, outer.fib, modes.fib,
                           SVAR = seg.fib, signif = 0.05,
                           deep = 2, size = 0.2, n.node = 20)

# }
