pls.pathmox: PATHMOX-PLS: Extended Segmentation Trees in Partial Least Squares Structural Equation Modeling (PLS-SEM)

Description

The function pls.pathmox calculates a binary segmentation tree in the context of PLS-SEM following the PATHMOX algorithm. It allows heterogeneity to be detected in PLS-SEM models when segmentation variables (categorical variables) external to the model are available and the objective of the research is exploratory. Pathmox adapts the principles of binary segmentation to produce a tree with a different model in each of the resulting nodes. Unlike classic decision trees, pathmox does not aim to predict predefined classes but to detect the different models present in the data. To this end, it identifies the splits (based on the segmentation variables) that maximally discriminate between models. Each binary split defines a pair of nodes, each with an associated structural model, i.e., an associated set of path coefficients. A global comparison test on the identity of the two models is then run. To avoid overfitting, pathmox adopts a pre-pruning process (i.e., stopping rules) based on maximum depth, minimum node size and non-significance of the F-statistic.

Usage

pls.pathmox(
  x,
  inner,
  outer,
  mode,
  scheme = "path",
  scaling = NULL,
  scaled = TRUE,
  SVAR,
  signif = 0.05,
  deep,
  method = "lm",
  size,
  tree = TRUE,
  n.node = 30,
  ...
)

Arguments

x

matrix or data frame containing the manifest variables.

inner

A square (lower triangular) boolean matrix representing the inner model (i.e. the path relationships between latent variables).

outer

list of vectors with column indices or column names from x indicating the sets of manifest variables forming each block (i.e. which manifest variables correspond to each block).

mode

character vector indicating the type of measurement for each block. Possible values are: "A", "B", "newA", "PLScore", "PLScow". The length of mode must be equal to the length of outer.

scheme

string indicating the type of inner weighting scheme. Possible values are "centroid", "factorial", or "path".

scaling

optional argument for running the non-metric approach; it is a list of string vectors indicating the type of measurement scale for each manifest variable specified in outer. scaling must be specified when working with non-metric variables. Possible values: "num" (linear transformation, suitable for numerical variables), "raw" (no transformation), "nom" (non-monotonic transformation, suitable for nominal variables), and "ord" (monotonic transformation, suitable for ordinal variables).

scaled

whether manifest variables should be standardized. Only used when scaling = NULL. By default (TRUE), data is scaled to standardized values (mean=0 and variance=1).

SVAR

A data frame of factors containing the segmentation variables.

signif

A numeric value indicating the significance threshold of the F-statistic. Must be a decimal number between 0 and 1.

deep

An integer indicating the depth level of the tree. Must be an integer greater than 1.

method

A string indicating the criterion used to calculate the test. It can be equal to "lm" or "lad".

size

A numeric value indicating the minimum node size, given as a proportion of elements (see Details).

tree

A logical value indicating whether the tree plot should be shown. By default it is equal to TRUE.

n.node

The minimum number of individuals required to consider a candidate partition (30 by default).

…

Further arguments passed on to pls.pathmox.
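As a small illustration of how outer and mode relate to x, the sketch below uses made-up data (the data frame and its column names a1, a2, b1, b2, b3 are hypothetical, not part of the package). It specifies the same two blocks once by column index and once by column name, and checks that there is one mode per block.

```r
# Hypothetical data: five manifest variables forming two blocks
x <- data.frame(a1 = c(1, 2, 3, 4), a2 = c(4, 3, 2, 1),
                b1 = c(2, 3, 1, 4), b2 = c(1, 1, 2, 2), b3 = c(4, 2, 3, 1))

# The same blocks given by column index and by column name
outer_by_index <- list(1:2, 3:5)
outer_by_name  <- list(c("a1", "a2"), c("b1", "b2", "b3"))

# One measurement mode per block
mode <- rep("A", length(outer_by_index))

# Both specifications select the same columns of x
same_blocks <- identical(lapply(outer_by_index, function(i) names(x)[i]),
                         outer_by_name)

# The length of mode must equal the length of outer
lengths_match <- length(mode) == length(outer_by_index)
```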

Value

An object of class "xtree.pls". Basically a list with the following results:

MOX

Data frame with the results of the segmentation tree

root

List of elements contained in the root node

terminal

List of elements contained in terminal nodes

nodes

List of elements contained in all nodes: terminal and intermediate

candidates

List of data frames containing the candidate splits of each node partition

Fg.r

Data frame containing the results of the F-global test for each node partition

Fc.r

A list of data frames containing the results of the F-coefficients test for each node partition

model

Information about the internal parameters

hybrid

a hybrid categorical factor defined according to the final segments identified by pathmox

Details

The argument x must be a data frame containing the manifest variables of the PLS-SEM model.

The argument inner is a matrix of zeros and ones that indicates the structural relationships between latent variables. inner must be a lower triangular matrix; it contains a 1 when column j affects row i, 0 otherwise.
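A minimal sketch of these constraints, using base R only: the code builds an inner matrix for three hypothetical latent variables (A, B and C are illustrative names, not part of the package) and checks that it is square, binary and lower triangular.

```r
# Hypothetical three-construct model: A -> B, A -> C, B -> C
A <- c(0, 0, 0)
B <- c(1, 0, 0)   # B is affected by A
C <- c(1, 1, 0)   # C is affected by A and B
inner <- rbind(A, B, C)
colnames(inner) <- rownames(inner)

# Checks implied by the description above
is_square <- nrow(inner) == ncol(inner)
is_binary <- all(inner %in% c(0, 1))
is_lower  <- all(inner[upper.tri(inner)] == 0)  # nothing above the diagonal
```

Reading the matrix row by row, inner["C", "A"] equal to 1 means that column A affects row C.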

The argument SVAR must be a data frame containing segmentation variables as factors. The number of rows in SVAR must be the same as the number of rows in the data used in x.
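The sketch below shows one way to prepare such a data frame and check both requirements. The data and the names x, seg, gender and age are made up for illustration. Treating ordinal variables as ordered factors follows the convention used in the example at the end of this page.

```r
# Hypothetical data: 6 observations, 2 manifest variables
x <- data.frame(mv1 = c(3, 5, 4, 2, 5, 1),
                mv2 = c(4, 4, 5, 1, 3, 2))

# Hypothetical segmentation variables, not yet stored as factors
seg <- data.frame(gender = c("f", "m", "f", "m", "f", "m"),
                  age    = c(1, 2, 3, 1, 2, 3),
                  stringsAsFactors = FALSE)

# Convert every column to a factor; ordinal ones become ordered factors
seg$gender <- factor(seg$gender)
seg$age    <- factor(seg$age, ordered = TRUE)

# Requirements stated above
all_factors <- all(vapply(seg, is.factor, logical(1)))
same_rows   <- nrow(seg) == nrow(x)
```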

The argument signif represents the p-value level taken as the reference for stopping the tree partitions. The default value is 0.05.

The argument deep represents the depth level of the tree taken as the reference for stopping the tree partitions.

The argument method is a string containing the criterion used to calculate the tests: if method="lm", the classic least squares approach is used to perform the tests; if method="lad", a LAD (least absolute deviation regression) approximation of the test is used.
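To give a feel for how the two criteria differ in general, the sketch below fits an intercept-only model to a small made-up sample containing one outlier. This is not the test computed by pls.pathmox, only an illustration of least squares versus least absolute deviations: the former is minimized by the sample mean, the latter by the sample median, which is far less sensitive to the outlier.

```r
y <- c(1, 2, 3, 4, 100)   # made-up sample with one outlier

# Least squares: the intercept of lm(y ~ 1) is the sample mean
ls_fit <- unname(coef(lm(y ~ 1)))

# Least absolute deviations: minimize the sum of absolute residuals
lad_loss <- function(b) sum(abs(y - b))
lad_fit  <- optimize(lad_loss, interval = range(y))$minimum

# The LAD estimate sits near the median, the LS estimate at the mean
c(least_squares = ls_fit, lad = lad_fit, mean = mean(y), median = median(y))
```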

The argument size is defined as a decimal value (i.e. proportion of elements inside a node).

The argument n.node is the minimum number of individuals required to consider a candidate partition. If the candidate split produces a partition where the number of individuals is less than n.node, the partition is not considered.
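The n.node rule can be sketched in base R as follows. The helper split_is_considered and the region variable are hypothetical and are not part of the package; the sketch also assumes that the rule applies to each of the two child nodes of a binary split.

```r
# Hypothetical segmentation variable for 100 individuals
region <- factor(rep(c("north", "south", "east"), times = c(55, 30, 15)))

# Hypothetical helper: does a binary split keep at least n.node
# individuals on both sides?
split_is_considered <- function(f, left_levels, n.node = 30) {
  n_left  <- sum(f %in% left_levels)
  n_right <- length(f) - n_left
  min(n_left, n_right) >= n.node
}

# 55 vs 45 individuals: both sides reach n.node = 30
keep_north <- split_is_considered(region, "north")

# 85 vs 15 individuals: the smaller side falls below n.node = 30
keep_north_south <- split_is_considered(region, c("north", "south"))
```

With these made-up counts, the first candidate split would be kept and the second discarded; lowering n.node to 15 would admit the second one as well.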

References

Lamberti, G. (2021) Hybrid multigroup partial least squares structural equation modelling: an application to bank employee satisfaction and loyalty. Quality and Quantity. doi: 10.1007/s11135-021-01096-9

Lamberti, G. et al. (2017) The Pathmox approach for PLS path modeling: Discovering which constructs differentiate segments. Applied Stochastic Models in Business and Industry. doi: 10.1002/asmb.2270

Lamberti, G. et al. (2016) The Pathmox approach for PLS path modeling segmentation. Applied Stochastic Models in Business and Industry. doi: 10.1002/asmb.2168

Lamberti, G. (2015) Modeling with Heterogeneity. PhD Dissertation.

Examples

# NOT RUN {
## Example of PLS-PM on bank customer satisfaction data
library(genpathmox)
data(csibank)

# select manifest variables
data.bank <- csibank[, 6:32]

# define inner model matrix (row i has a 1 in column j when j affects i)
Image       <- rep(0, 6)
Expectation <- c(1, 0, 0, 0, 0, 0)
Quality     <- c(0, 1, 0, 0, 0, 0)
Value       <- c(0, 1, 1, 0, 0, 0)
Satis       <- c(1, 1, 1, 1, 0, 0)
Loyalty     <- c(1, 0, 0, 0, 1, 0)
inner.bank <- rbind(Image, Expectation, Quality, Value, Satis, Loyalty)
colnames(inner.bank) <- rownames(inner.bank)

# blocks of indicators (outer model)
outer.bank <- list(1:6, 7:10, 11:17, 18:21, 22:24, 25:27)
modes.bank <- rep("A", 6)

# segmentation variables; those on an ordinal scale become ordered factors
seg.bank <- csibank[, 1:5]
seg.bank$Age <- factor(seg.bank$Age, ordered = TRUE)
seg.bank$Education <- factor(seg.bank$Education, ordered = TRUE)

# Pathmox analysis
bank.pathmox <- pls.pathmox(data.bank, inner.bank, outer.bank, modes.bank,
                            SVAR = seg.bank, signif = 0.05,
                            deep = 2, size = 0.2, n.node = 20)
# }