FlexRL (version 0.1.0)

loglikSurvival: The log likelihood of the survival function with exponential model (-)

Description

Log-likelihood of the survival function under an exponential model (as proposed in our paper). The survival function gives the probability that the true values of a pair of records referring to the same entity coincide; see ?FlexRL::SurvivalUnstable. This function is only used if the PIV is unstable and evolves over time, in which case the true values of a linked pair of records may not coincide. If you want to use a different survival function to model instability, change both the function 'SurvivalUnstable' and this function 'loglikSurvival'.

Usage

loglikSurvival(alphas, X, times, Hequal)

Value

The opposite (-) of the log-likelihood associated with the survival function that defines the probabilities that true values coincide (as defined in the paper). The algorithm minimises -log(likelihood), i.e. it maximises the log-likelihood.

Arguments

alphas

A vector of length 1 + (number of covariates from A) + (number of covariates from B) containing the coefficients of the hazard (baseline hazard and conditional hazard)

X

A matrix with one row per linked record and 1 + (number of covariates from A) + (number of covariates from B) columns (first column: intercept; following columns: covariates from A and then from B used to model instability). For the optimisation, X concatenates the X matrices obtained in each iteration of the Gibbs sampler.

times

A vector with one element per linked record giving the time gap between the records from the two sources. For the optimisation, times concatenates the times vectors obtained in each iteration of the Gibbs sampler.

Hequal

A vector with one element per linked record holding boolean values that indicate whether the values in A and in B coincide. For the optimisation, Hequal concatenates the Hequal vectors obtained in each iteration of the Gibbs sampler.
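
The shapes described above can be illustrated with a small sketch. Everything here is hypothetical: the per-iteration lists, the covariate names covA and covB, and the values are invented for illustration and are not produced by FlexRL. The sketch only shows how per-iteration pieces might be stacked so that X, times and Hequal stay aligned row by row.

```r
# Hypothetical outputs of two Gibbs iterations (illustrative values only).
# Each iteration may link a different number of record pairs.
iter1 <- list(X = cbind(intercept = 1, covA = c(2.0, 2.1), covB = c(0, 1)),
              times = c(0.2, 1.3),
              Hequal = c(TRUE, FALSE))
iter2 <- list(X = cbind(intercept = 1, covA = c(3.4, 2.5, 2.9), covB = c(1, 0, 1)),
              times = c(0.001, 1.5, 2),
              Hequal = c(TRUE, TRUE, FALSE))
iters <- list(iter1, iter2)

# Stack the design matrices by row and concatenate the vectors in the same order
X <- do.call(rbind, lapply(iters, `[[`, "X"))
times <- unlist(lapply(iters, `[[`, "times"))
Hequal <- unlist(lapply(iters, `[[`, "Hequal"))

dim(X)          # rows: linked pairs over all iterations; columns: 1 + covariates
length(times)   # one time gap per row of X
```

Here alphas would need one coefficient per column of X.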

Details

In our Stochastic Expectation Maximisation (StEM) algorithm (see ?FlexRL::StEM) we minimise -log(likelihood), which is equivalent to maximising the log-likelihood. Therefore this function returns (and should return, if you write your own) the opposite (-) of the log-likelihood associated with the survival function defining the probabilities that true values coincide.
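
As a rough illustration of the sign convention, the sketch below writes a negative log-likelihood by hand. It is not the package's implementation: the helper name negLogLikExpSketch is hypothetical, and the survival form S(t) = exp(-exp(X %*% alphas) * t) is an assumption made for this example. Each linked pair contributes log S(t) when its values coincide and log(1 - S(t)) otherwise, and the function returns minus the sum so that it can be passed to a minimiser.

```r
# Hypothetical helper, not part of FlexRL. Assumes an exponential survival
# model S(t) = exp(-exp(X %*% alphas) * t) for the probability of coinciding.
negLogLikExpSketch <- function(alphas, X, times, Hequal) {
  X <- as.matrix(X)
  hazard <- exp(as.vector(X %*% alphas))   # positive hazard per linked pair
  logS <- -hazard * times                  # log P(true values coincide)
  log1mS <- log(-expm1(logS))              # log P(true values differ)
  # Return the opposite of the log-likelihood: minimisers look for a minimum
  -sum(ifelse(Hequal, logS, log1mS))
}

# Intercept-only toy data (illustrative values)
X <- cbind(intercept = rep(1, 5))
times <- c(0.001, 0.2, 1.3, 1.5, 2)
Hequal <- c(TRUE, TRUE, TRUE, FALSE, FALSE)

fit <- stats::nlminb(-0.05, negLogLikExpSketch,
                     X = X, times = times, Hequal = Hequal)
fit$par   # estimated log baseline hazard under the assumed model
```

A custom replacement for loglikSurvival would need to keep the same argument list and the same sign, since the optimiser minimises the returned value.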

Examples

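# Number of hazard coefficients to estimate (a single covariate here)
nCoefUnstable = 1
# Starting values for the optimiser
alphaInit = rep(-0.05, nCoefUnstable)
# Toy data: one covariate, the time gap, and whether the values coincide
Valpha = base::data.frame(list(cov=c(2,2.1,3.4,2.5,2.9),
                               times=c(0.001,0.2,1.3,1.5,2),
                               Hequal=c(TRUE, TRUE, TRUE, FALSE, FALSE)))
X = Valpha[,1:nCoefUnstable]
times = Valpha$times
Hequal = Valpha$Hequal
# Minimise -log(likelihood) over the hazard coefficients
optim = stats::nlminb(alphaInit, FlexRL::loglikSurvival, control=list(trace=FALSE),
                      X=X, times=times, Hequal=Hequal)
# Estimated coefficients
alpha = optim$par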