Many geographical analyses use spatial autocorrelation, which allows us to study geographical evolution from different points of view. One measure of spatial autocorrelation is Moran's I, which is based on Pearson's correlation coefficient from general statistics (arXiv:1606.03658).

Performing the Analysis

This package offers a straightforward way to perform the whole analysis using the function rescaleI, which requires an input file with a specific format; see the Loading data section for details.

library(Irescale)
fileInput<-system.file("testdata", "chen.csv", package="Irescale")
data<-loadFile(fileInput)
scaledI<-rescaleI(data,samples=1000, scalingUpTo="MaxMin")
fn = file.path(tempdir(),"output.csv",fsep = .Platform$file.sep)
saveFile(fn,scaledI)
# Delete the output file if it exists
if (file.exists(fn)) {
  file.remove(fn)
}

Analysis Step by Step

The analysis can also be carried out one step at a time, following the steps below.

Loading data

The input file^[The data used in this example is taken from [@chen2009].] should have the following format.

  • The first column represents a unique ID for the record.
  • The second and third columns represent the latitude and longitude of where the sample was taken.
  • The fourth column and beyond represent the different measured variables.
fileInput<-system.file("testdata", "chen.csv", package="Irescale")
head(read.csv(fileInput))
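
To make the layout above concrete, here is a minimal base R sketch that builds a small file in that format and splits it into coordinates and measured variables. The column names, values, and the file name toy_input.csv are made up for illustration, and the split shown here is only an assumption about what a loader needs to do, not the package's implementation.

```r
# Hypothetical toy data in the documented layout: id, latitude, longitude, variables.
toy <- data.frame(
  id = 1:4,
  latitude = c(10.0, 10.0, 11.0, 11.0),
  longitude = c(20.0, 21.0, 20.0, 21.0),
  elevation = c(5.1, 4.8, 6.2, 6.0),
  moisture = c(0.31, 0.28, 0.40, 0.37)
)

# Write the data to a temporary CSV file and read it back.
toyFile <- file.path(tempdir(), "toy_input.csv")
write.csv(toy, toyFile, row.names = FALSE)
raw <- read.csv(toyFile)

# Columns 2-3 hold the coordinates; column 4 onward holds the measured variables.
coords <- as.matrix(raw[, 2:3])
vars <- as.matrix(raw[, 4:ncol(raw), drop = FALSE])

# Clean up the temporary file.
invisible(file.remove(toyFile))
```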

Loading the data for the analysis is simple. The function loadFile provides the interface to do it. loadFile returns a list with two elements, data and varOfInterest: data holds the latitude and longitude of each sample, and varOfInterest is a matrix with all the measurements from the field.

library(Irescale)
fileInput<-system.file("testdata", "chen.csv", package="Irescale")
input<-loadFile(fileInput)
head(input$data)
head(input$varOfInterest)

If the data has a chessboard shape, the file is organized in rows and columns, where the rows represent latitude, the columns represent longitude, and the measurements are in the cells. The function loadChessBoard can be used to load such a file into the analysis.

library(Irescale)
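# Files under inst/ in the package source are installed at the package root
fileInput<-system.file("testdata", "chessboard.csv", package="Irescale")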
input<-loadChessBoard(fileInput)
head(input$data)
head(input$varOfInterest)
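
As an illustration of the chessboard layout, the following base R sketch turns a small matrix into one coordinate pair and one value per cell. The board values are hypothetical, and this is only a sketch of the idea, not the code behind loadChessBoard.

```r
# Hypothetical 3 x 4 board: rows play the role of latitude, columns of longitude.
board <- matrix(c(1, 2, 3, 4,
                  5, 6, 7, 8,
                  9, 10, 11, 12), nrow = 3, byrow = TRUE)

# row() and col() return index matrices; as.vector() flattens them column by
# column, in the same order as as.vector(board), so positions stay aligned.
coords <- cbind(row = as.vector(row(board)), col = as.vector(col(board)))
values <- as.vector(board)

# One line per cell: its row index, column index, and measurement.
head(cbind(coords, value = values))
```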

Calculate Distance

Once the data is loaded, the distance matrix, that is, the distance between every pair of points, can be calculated. Use calculateEuclideanDistance if the points were taken at geospatial locations.

library(Irescale)
fileInput<-system.file("testdata", "chen.csv", package="Irescale")
input<-loadFile(fileInput)
distM<-calculateEuclideanDistance(input$data)
distM[1:5,1:5]
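
For readers who want to see what a Euclidean distance matrix looks like without the package, here is a small sketch using base R's dist(). The three points are hypothetical, and the coordinates are treated as planar, which is an assumption of this sketch.

```r
# Three hypothetical points forming a 3-4-5 right triangle.
coords <- cbind(latitude = c(0, 0, 3), longitude = c(0, 4, 0))

# dist() returns the lower triangle; as.matrix() expands it to a full
# symmetric matrix with zeros on the diagonal.
distM <- as.matrix(dist(coords, method = "euclidean"))
distM
```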

If the data is taken from a chessboard-like field, the Manhattan distance can be used.

library(Irescale)
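# Files under inst/ in the package source are installed at the package root
fileInput<-system.file("testdata", "chessboard.csv", package="Irescale")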
input<-loadChessBoard(fileInput)
distM<-calculateManhattanDistance(input$data)
distM[1:5,1:5]
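
The Manhattan distance between two cells is the sum of the absolute differences of their row and column indices. A minimal base R sketch with hypothetical cell positions:

```r
# Three hypothetical cells given by (row, column) indices on a board.
coords <- cbind(row = c(1, 1, 3), col = c(1, 4, 2))

# Manhattan distance: |row_i - row_j| + |col_i - col_j| for every pair of cells.
distM <- as.matrix(dist(coords, method = "manhattan"))
distM
```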

Calculate Weighted Distance Matrix

The weighted distance matrix can be calculated using the function calculateWeightedDistMatrix; however, this step is not required, because calculateMoranI does it internally.

library(Irescale)
fileInput<-system.file("testdata", "chen.csv", package="Irescale")
input<-loadFile(fileInput)
distM<-calculateEuclideanDistance(input$data)
distW<-calculateWeightedDistMatrix(distM)
distW[1:5,1:5]
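
One common way to turn distances into spatial weights is to take the inverse distance and normalize the result. The sketch below assumes that scheme purely for illustration; the weighting actually used by calculateWeightedDistMatrix may differ.

```r
# Three hypothetical points and their Euclidean distance matrix.
coords <- cbind(c(0, 0, 3), c(0, 4, 0))
distM <- as.matrix(dist(coords))

# Assumed weighting: inverse distance, with zero weight on the diagonal
# so that a point is not treated as its own neighbor.
w <- ifelse(distM > 0, 1 / distM, 0)

# Normalize so that all weights sum to 1.
w <- w / sum(w)
w
```

Under this scheme, closer pairs of points receive larger weights than distant pairs.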

Moran's I

It is time to calculate the spatial autocorrelation statistic, Moran's I. The function calculateMoranI requires the distance matrix and the variable you are interested in.

library(Irescale)
fileInput<-system.file("testdata", "chen.csv", package="Irescale")
input<-loadFile(fileInput)
distM<-calculateEuclideanDistance(input$data)
I<-calculateMoranI(distM = distM,varOfInterest = input$varOfInterest)
I
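
For intuition about what the statistic measures, here is a base R sketch of the textbook form of Moran's I, I = (n / sum(w)) * sum_ij(w_ij * z_i * z_j) / sum_i(z_i^2), where z is the centered variable. The helper name moranI, the inverse-distance weights, and the toy data are assumptions of this sketch; it is not the code used by calculateMoranI.

```r
# Textbook Moran's I for a numeric vector x and a weight matrix w
# (hypothetical helper, written only for this illustration).
moranI <- function(x, w) {
  z <- x - mean(x)
  n <- length(x)
  (n / sum(w)) * sum(w * outer(z, z)) / sum(z^2)
}

# Five hypothetical points on a line, with assumed inverse-distance weights.
coords <- cbind(1:5, 0)
distM <- as.matrix(dist(coords))
w <- ifelse(distM > 0, 1 / distM, 0)

# A smooth gradient: neighboring values are similar.
gradientI <- moranI(c(1, 2, 3, 4, 5), w)

# An alternating pattern: neighboring values are dissimilar.
alternatingI <- moranI(c(1, -1, 1, -1, 1), w)

c(gradient = gradientI, alternating = alternatingI)
```

The gradient yields a positive value and the alternating pattern a negative one, which matches the usual reading of I as a spatial analogue of a correlation coefficient.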

Resampling Method for I

The scaling process uses a Monte Carlo resampling method. The idea is to shuffle the values and recalculate I at least 1000 times. In the code below, after resampling the value of I, a set of statistics is calculated for the generated vector.

library(Irescale)
fileInput<-system.file("testdata", "chen.csv", package="Irescale")
input<-loadFile(fileInput)
distM<-calculateEuclideanDistance(input$data)
I<-calculateMoranI(distM = distM,varOfInterest = input$varOfInterest)
vI<-resamplingI(1000,distM, input$varOfInterest) # This is the permutation
statsVI<-summaryVector(vI)
statsVI
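
The shuffling idea can be sketched in a few lines of base R: permute the variable over the fixed locations, recompute I each time, and collect the results as a null distribution. The helper moranI, the random data, and the inverse-distance weights are assumptions of this sketch and do not reproduce resamplingI.

```r
# Textbook Moran's I (hypothetical helper for this illustration).
moranI <- function(x, w) {
  z <- x - mean(x)
  (length(x) / sum(w)) * sum(w * outer(z, z)) / sum(z^2)
}

set.seed(42)
n <- 30
coords <- cbind(runif(n), runif(n))      # hypothetical sample locations
distM <- as.matrix(dist(coords))
w <- ifelse(distM > 0, 1 / distM, 0)     # assumed inverse-distance weights
x <- rnorm(n)                            # hypothetical variable of interest

# Shuffle the values over the locations 1000 times and recompute I each time.
nullI <- replicate(1000, moranI(sample(x), w))

# Basic summary of the resampled vector.
summary(nullI)
```

Under random permutation the expected value of I is -1/(n - 1), so the mean of the resampled vector should sit slightly below zero.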

Plotting Distribution (Optional)

To see how the value of I is distributed, the function plotHistogramOverlayNormal draws a histogram of the vector generated by resampling, with a theoretical normal distribution overlaid.

library(Irescale)
fileInput<-system.file("testdata", "chen.csv", package="Irescale")
input<-loadFile(fileInput)
distM<-calculateEuclideanDistance(input$data)
I<-calculateMoranI(distM = distM,varOfInterest = input$varOfInterest)
vI<-resamplingI(1000,distM, input$varOfInterest) # This is the permutation
statsVI<-summaryVector(vI)
plotHistogramOverlayNormal(vI,statsVI, main=colnames(input$varOfInterest))
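
A comparable plot can be produced with base R graphics. In the sketch below, vI is a simulated stand-in for a resampled vector, and the overlay uses the sample mean and standard deviation; this illustrates the idea only and is not the package's plotting code.

```r
set.seed(1)
# Simulated stand-in for a vector of resampled I values.
vI <- rnorm(1000, mean = -0.03, sd = 0.05)

# Histogram on the density scale so that it is comparable with a density curve.
h <- hist(vI, breaks = 30, freq = FALSE, main = "Resampled I", xlab = "I")

# Overlay a normal density with the same mean and standard deviation.
curve(dnorm(x, mean = mean(vI), sd = sd(vI)), add = TRUE, lwd = 2)
```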

Rescaling I

Once the null distribution has been calculated via resampling, I needs to be scaled by centering and stretching. The function iCorrection returns an object with the rescaled resampling vector and the summary statistics for this vector; the new value of I is returned in a variable named newI.

library(Irescale)
fileInput<-system.file("testdata", "chen.csv", package="Irescale")
input<-loadFile(fileInput)
distM<-calculateEuclideanDistance(input$data)
I<-calculateMoranI(distM = distM,varOfInterest = input$varOfInterest)
vI<-resamplingI(1000,distM, input$varOfInterest) # This is the permutation
statsVI<-summaryVector(vI)
corrections<-iCorrection(I,vI)
corrections$newI
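
One plausible reading of "centering and stretching" is to subtract the center of the null distribution and then divide each side by its distance to the corresponding extreme, so that the null distribution spans -1 to 1. The sketch below implements that reading in base R; the helper rescaleSketch, the use of the median as center, and the simulated data are all assumptions, and iCorrection may work differently.

```r
# Hypothetical helper: center on the null median, then stretch each side
# by its distance to the null maximum (positive side) or minimum (negative side).
rescaleSketch <- function(value, nullI) {
  center <- median(nullI)
  centered <- value - center
  stretch <- ifelse(centered >= 0, max(nullI) - center, center - min(nullI))
  centered / stretch
}

set.seed(7)
nullI <- rnorm(1000, mean = -0.03, sd = 0.05)  # stand-in for a resampled vector
observedI <- 0.08                              # hypothetical observed I

scaledNull <- rescaleSketch(nullI, nullI)      # rescaled null distribution
newI <- rescaleSketch(observedI, nullI)        # rescaled observed value

range(scaledNull)
newI
```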

Calculate P-value

To assess the significance of this new value, you can calculate the p-value using the function calculatePvalue. This function requires the scaled vector (scaledData, returned by iCorrection), the scaled I (newI), and the mean of scaledData.

library(Irescale)
fileInput<-system.file("testdata", "chen.csv", package="Irescale")
input<-loadFile(fileInput)
distM<-calculateEuclideanDistance(input$data)
I<-calculateMoranI(distM = distM,varOfInterest = input$varOfInterest)
vI<-resamplingI(1000,distM, input$varOfInterest) # This is the permutation
statsVI<-summaryVector(vI)
corrections<-iCorrection(I,vI)
pvalueIscaled<-calculatePvalue(corrections$scaledData,corrections$newI,corrections$summaryScaledD$mean)
pvalueIscaled
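
A permutation p-value is commonly computed as the share of null values that lie at least as far from the center as the observed value. The sketch below assumes a two-sided test with the usual +1 correction; the helper empiricalPvalue and the simulated data are hypothetical, and calculatePvalue may use a different definition.

```r
# Hypothetical helper: two-sided empirical p-value around the mean of the
# null values, with a +1 correction so the result is never exactly zero.
empiricalPvalue <- function(nullValues, observed) {
  center <- mean(nullValues)
  extreme <- abs(nullValues - center) >= abs(observed - center)
  (sum(extreme) + 1) / (length(nullValues) + 1)
}

set.seed(11)
scaledNull <- rnorm(1000, mean = 0, sd = 0.3)  # stand-in for a scaled null vector
newI <- 0.9                                    # hypothetical rescaled I

pvalue <- empiricalPvalue(scaledNull, newI)
pvalue
```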

Stability Analysis

To determine how many iterations are necessary for the resampling method, it is possible to run a stability analysis. The function buildStabilityTable draws a chart in log scale (10^x) of the number of iterations needed to achieve stability in the Monte Carlo simulation.

fileInput<-system.file("testdata", "chen.csv", package="Irescale")
input<-loadFile(fileInput)
resultsChen<-buildStabilityTable(data=input, times=100, samples=1000, plots=TRUE)
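
The idea behind a stability analysis can be sketched as follows: repeat the resampling several times at each sample size on a log scale (10, 100, 1000), and check how much a statistic of the null distribution varies between repetitions. The choice of the 95th percentile as the tracked statistic, the helper moranI, and the simulated data are assumptions of this sketch, not the procedure implemented by buildStabilityTable.

```r
# Textbook Moran's I (hypothetical helper for this illustration).
moranI <- function(x, w) {
  z <- x - mean(x)
  (length(x) / sum(w)) * sum(w * outer(z, z)) / sum(z^2)
}

set.seed(3)
n <- 25
coords <- cbind(runif(n), runif(n))      # hypothetical sample locations
distM <- as.matrix(dist(coords))
w <- ifelse(distM > 0, 1 / distM, 0)     # assumed inverse-distance weights
x <- rnorm(n)                            # hypothetical variable of interest

sampleSizes <- 10^(1:3)  # number of permutations per run, on a log10 scale
times <- 20              # how often each run is repeated

# For each sample size, repeat the resampling and record the spread of the
# 95th percentile of the null distribution across repetitions.
spread <- sapply(sampleSizes, function(s) {
  q95 <- replicate(times, quantile(replicate(s, moranI(sample(x), w)), 0.95))
  sd(q95)
})

data.frame(samples = sampleSizes, sdOfQ95 = spread)
```

A spread that shrinks as the number of permutations grows suggests that the resampling has become stable at that sample size.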

Install

install.packages('Irescale')

Version: 2.3.0

License: GPL (>= 2)

Maintainer: Ivan Fuentes

Last Published: November 21st, 2019

Functions in Irescale (2.3.0)

loadChessBoard

Loads a chessboard-like or matrix-like input file.
loadDistanceMatrix

Loads a distance matrix directly, instead of computing it from latitude and longitude.
coor

Transforms a x,y position in a cartesian plane into a position in a 1D array.
convexHull

Plots the convex hull polygon from the data (latitude, longitude), and calculates the center of the convex hull and its area.
plotHistogramOverlayCorrelation

Creates an overlay of the histogram of the data and the theoretical normal distribution.
rescaleI

Performs the rescale for all the variables in an input file.
saveFile

Saves a report with important statistics to describe the sample.
plotHistogramOverlayNormal

Creates an overlay of the histogram of the data and the theoretical normal distribution.
transformImageToMatrix

Transforms the image to a matrix.
transformImageToList

Transforms the image into the object needed to run the analysis.
loadSatelliteImage

Loads a satellite image in PNG format.
loadFile

Loads a file with latitude, longitude, and the variables of interest.
calculateDistMatrixFromBoard

Calculates the distance in a chessboard-like structure.
expectedValueI

Calculates the expected value for local I.
localICorrection

Scaling process for Local Moran's I.
nullDristribution

Calculate a distribution of how the var of interest is correlated to a
resamplingLocalI

Calculates n permutations of the variable of interest to calculate n different I in order to create the null distribution.
summaryVector

Calculates summary statistics for the received vector.
summaryLocalIVector

Calculates summary statistics for the received matrix.
resamplingI

Calculates n permutations of the variable of interest to calculate n different I in order to create the null distribution.
standardizedByColumn

Scales a matrix by column.
standardize

Standardizes the input vector.
procrustes

Calculates the Procrustes distance between two surfaces.
iCorrection

Scaling process for Moran's I.
rectifyIrho

Rectifies I using a correlation method for all the variables in an input file.
calculatePvalue

p-value calculation.
buildStabilityTableForCorrelation

Finds how many iterations are necessary to achieve stability in the resampling method for rectifying I through Pearson correlation.
calculateWeightedDistMatrix

Calculates a weighted representation of the distance matrix.
calculateEuclideanDistance

Given a 2D data structure, calculates the Euclidean distance among all the points.
ItoPearsonCorrelation

Calculates the equivalent r from the percentile of I in the null distribution of I.
calculateLocalI

Computes the Local Moran's I.
buildStabilityTable

Finds how many iterations are necessary to achieve stability in the resampling method.
calculateMoranI

Calculates the Moran's I using the algorithm proposed by Chen chen2009Irescale.
calculateManhattanDistance

Calculates the Manhattan distance.