model.mapmake: Map Making

Description

Applies models to either ERDAS Imagine image (.img) files or ESRI Grids of predictors to create detailed prediction surfaces. It will handle large predictor files for map making, by reading in the .img files in rows, and output to the .img file the prediction for each data row, before reading the next row of data.

Usage

model.mapmake(model.obj= NULL, folder = NULL, MODELfn = NULL, 
rastLUTfn = NULL, na.action = NULL, na.value=-9999, keep.predictor.brick=FALSE, 
map.sd = FALSE, OUTPUTfn = NULL, quantiles=NULL, n.trees = NULL)

Arguments

model.obj

R model object. The model object to use for prediction. The model object must be of type "RF" (random forest), "QRF" (quantile random forest), "CF" (conditional forest), or "SGB" (stochast

folder

String. The folder used for all output from predictions and/or maps. Do not add ending slash to path string. If folder = NULL (default), a GUI interface prompts user to browse to a folder. To use the working directory, specify folde

MODELfn

String. The file name to use to save the generated model object. If MODELfn = NULL (the default), a default name is generated by pasting model.type_response.type_response.name. If the other output filenames are left unspecified

rastLUTfn

String. The file name (full path or base name with path specified by folder) of a .csv file for a rastLUT. Alternatively, a dataframe containing the same information. The rastLUT must include 3 columns:

na.action

String. Model validation. Specifies the action to take if there are NA values in the prediction data or if there is a level or class of a categorical predictor variable in the validation test set or the production (mapping) data set, but no

na.value

Number. Value that indicates NA in the predictor rasters. Defaults to -9999. Note: all predictor rasters must use the same value for NA.

keep.predictor.brick

Logical. Map Production. If TRUE then the raster brick containing the predictors from the model object is saved as a native raster package format file. If FALSE a temporary brick is created and then deleted at the end of map pro

map.sd

Logical. Map Production. If map.sd = TRUE, maps of mean, standard deviation, and coefficient of variation of the predictions from all the trees are generated for each pixel. If map.sd = FALSE (the default), only the predicted

OUTPUTfn

String. Map Production. Filename of output file for map production. The filename can be the full path, or it can be the simple basename, in which case the output will be to the folder specified by folder. If OUTPUTfn = NULL (t

quantiles

Numeric Vector. QRF models. The quantiles to predict. A numeric vector with values between zero and one. The quantile map output will be a multilayer raster with one layer for each quantile. If model.obj was created with the ModelMap<

n.trees

Integer. SGB models. The number of stochastic gradient boosting trees for an SGB model. If n.trees=NULL (the default) the model creation code will increase the number of trees 100 at a time until OOB error rate stops improving. The gb

Value

The model.mapmake() function does not return a value, instead it writes a raster file of map information (suitable for importing into a GIS) to the specified folder. The output raster is saved in the format specifed by the file extension of OUTPUTfn The model.mapmake() function also writes a text file listing the projections of all predictor rasters. For categorical response models, a csv file map key is written giving the integer code associated with each response category. If keep.predictor.brick = TRUE then a raster brick of all the predictor rasters from the model is also saved to the specified folder. If keep.predictor.brick = FALSE (the default) then the predictor brick is written to a temprary file, and deleted. Warning: the predictor bricks can be quite large, and saving them can require quite a bit of memory.

Details

model.mapmake() can be run in a traditional R command mode, where all arguments are specified in the function call. However it can also be used in a full push button mode, where you type in the simple command model.mapmake(), and GUI pop up windows will ask questions about the type of model, the file locations of the data, etc... When running model.mapmake() on non-Windows platforms, file names and folders need to be specified in the argument list, but other pushbutton selections are handled by the select.list() function, which is platform independent. The R package raster is used to read spatial rasters into R. The data for production mapping should be in the form of pixel-based raster layers representing the predictors in the model. If there is more than one predictor or raster layer, the layers must all have the same number of columns and rows. The layers must also have the same extent, projection, and pixel size, for effective model development and accuracy. The raster package function compareRaster() is used to check predictor layers for consistency. The layers must also be in (single or multi-band) raster data formats that can be read by package raster, for example ESRI Grid or ERDAS Imagine image files. The predictor layers must have continuous or categorical data values. See writeRaster for a list of available formats. To improve processing speed, the raster package is used to create a raster brick object with a layer for each predictor in the model. By default, this brick is a temporary file that is automatically deleated as soon as the map is completed. If keep.predictor.brick=TRUE, the predictor brick with be saved as a native raster package file, with a file name created by appending '_brick' to the OUTPUTfn. Warning: these bricks can be quite large, as they contain all the predictor data for every pixel in the map. When creating maps of non-rectangular study regions there may be large portions of the rectangle where you have no predictors, and are uninterested in making predictions. The suggested value for the pixels outside the study area is -9999. These pixels will be ignored in the predictions, thus saving computing time. The function model.mapmake() outputs an rater file of map information suitable to be imported into a GIS. Maps can also be imported back into R using the function raster() from the raster package. The file extension of OUTPUTfn determines the write format. If OUTPUTfn does not include a file extension, output will default to an ERDAS Imagine image file with extension ".img" For Binary response models the output is in the form of predicted probability of presence for each pixel. For Continuous response models the output is the predicted value for each pixel. For Categorical response models the map output depends on the category labels. If the categorical response variable is numeric, the map output will use the original numeric categories. If the categories are non-numeric (for example, character strings), map output is in the form of integer class codes for each pixel, coded for each level of the factored response, and a CSV file containing a look up table is also generated to associate the integer codes with the original values of the response categories. The first predictor from predList is used to determine projection of output Imagine Image file.

References

Breiman, L. (2001) Random Forests. Machine Learning, 45:5-32. Friedman, J.H. (2001). Greedy function approximation: a gradient boosting machine. Ann. Stat., 29(5):1189-1232. Friedman, J.H. (2002). Stochastic gradient boosting. Comput. Stat. Data An., 38(4):367-378. Liaw, A. and Wiener, M. (2002). Classification and Regression by randomForest. R News 2(3), 18--22. Ridgeway, G., (1999). The state of boosting. Comp. Sci. Stat. 31:172-181 Simpson, E. H. (1949). Measurement of diversity. Nature.

Examples

Run this code

###########################################################################
############################# Run this set up code: #######################
###########################################################################

# set seed:
seed=38

# Define training and test files:

qdata.trainfn = system.file("extdata", "helpexamples","DATATRAIN.csv", package = "ModelMap")

# Define folder for all output:
folder=getwd()	


#identifier for individual training and test data points

unique.rowname="ID"


###########################################################################
######################## Define the model: ################################
###########################################################################


########## Continuous Response, Continuous Predictors ############

#file name to store model:
MODELfn="RF_Bio_TC"				

#predictors:
predList=c("TCB","TCG","TCW")	

#define which predictors are categorical:
predFactor=FALSE	

# Response name and type:
response.name="BIO"
response.type="continuous"



###########################################################################
########################### build model: ##################################
###########################################################################


### create model ###

model.obj = model.build( model.type="RF",
                       qdata.trainfn=qdata.trainfn,
                       folder=folder,		
                       unique.rowname=unique.rowname,		
                       MODELfn=MODELfn,
                       predList=predList,
                       predFactor=predFactor,
                       response.name=response.name,
                       response.type=response.type,
                       seed=seed,
                       na.action="na.roughfix"
)



###########################################################################
############ Then Run this code to predict map pixels #####################
###########################################################################


### Create a the filename (including path) for the rast Look up Tables ###


rastLUTfn.2001 <- system.file(	"extdata",
				"helpexamples",
				"LUT_2001.csv",
				package="ModelMap")
                         
                 



### Load rast LUT table, and add path to the predictor raster filenames in column 1 ###

rastLUT.2001 <- read.table(rastLUTfn.2001,header=FALSE,sep=",",stringsAsFactors=FALSE)

for(i in 1:nrow(rastLUT.2001)){
	rastLUT.2001[i,1] <- system.file("extdata",
					"helpexamples",
					rastLUT.2001[i,1],
					package="ModelMap")
}                                 


### Define filename for map  output ###

OUTPUTfn.2001 <- "RF_BIO_TCandNLCD_01.img"
OUTPUTfn.2001 <- paste(folder,OUTPUTfn.2001,sep="/")


### Create image files of predicted map data ###

model.mapmake( model.obj=model.obj,
               folder=folder,		
               rastLUTfn=rastLUT.2001,
           # Mapping arguments						
               OUTPUTfn=OUTPUTfn.2001
               )


###########################################################################
################ run this code to create maps in R ########################
###########################################################################

### Define Color Ramp ###

l <- seq(100,0,length.out=101)
c <- seq(0,100,length.out=101)
col.ramp <- hcl(h = 120, c = c, l = l)


### read in map data ###

mapgrid.2001 <- raster(OUTPUTfn.2001)

#mapgrid.2001 <- setMinMax(mapgrid.2001)


### create map ###

dev.new(width = 5, height = 5)
opar <- par(mar=c(3,3,2,1),oma=c(0,0,3,4),xpd=NA)

zlim <- c(0,max(maxValue(mapgrid.2001)))
legend.label<-rev(pretty(zlim,n=5))
legend.colors<-col.ramp[trunc((legend.label/max(legend.label))*100)+1]

image(  mapgrid.2001, col = col.ramp, zlim=zlim, asp=1, bty="n", 
        xaxt="n", yaxt="n", main="", xlab="", ylab="")
mtext("2001 Imagery",side=3,line=1,cex=1.2)

legend(	x=xmax(mapgrid.2001),y=ymax(mapgrid.2001),
	legend=legend.label,
	fill=legend.colors,
	bty="n",
	cex=1.2
)
mtext("Predictions",side=3,line=1,cex=1.5,outer=TRUE)
par(opar)

Run the code above in your browser using DataLab