
userinput, and writes the processed data to a data.frame. The data.frame output of WUX (the WUX data frame) contains the climate change signals for user-specified periods, regions, seasons, and parameters for each of the indicated climate models as defined in userinput.
The userinput is a named list object or a file containing a named list. It passes the controlling parameters to models2wux. The file paths, file names and meta-information on the climate simulations are stored in another list called modelinput. See the "Details" section and the "Configfile userinput" and "Configfile modelinput" sections for a detailed description of these two lists.
models2wux(userinput, modelinput)
data.frame of class c("wux.df", "data.frame") containing climate change signals for all models, subregions, and parameters specified in userinput. It also writes a csv file to your hard disk.
models2wux
.
parameter.names
A character vector of parameters to be processed, named according to the NetCDF Climate and Forecast (CF) Metadata Convention (http://cfconventions.org/), e.g. parameter.names = c("air_temperature", "precipitation_amount").
reference.period
A character specifying the climate change reference period defined by "from-to" ("YYYY-YYYY"), e.g. reference.period = "1961-1990".
scenario.period
A character specifying the climate change future period defined by "from-to" ("YYYY-YYYY"), e.g. scenario.period = "2021-2050".
temporal.aggregation
A named list containing the n different levels of statistical aggregation, where the single list elements are sequentially named stat.level.1, stat.level.2, stat.level.3, ..., stat.level.n. Each stat.level is again a list containing three elements: period, statistic, and time.series.
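As an illustration, a temporal.aggregation list with a single aggregation level might look like the sketch below. Only the tag names (stat.level.1, period, statistic, time.series) are taken from the description above. The concrete values (season labels, comma-separated month strings, the statistic name) are assumptions and should be checked against the example user inputs shipped with the package, such as userinput_CMIP5_changesignal.

```r
# Hypothetical values: only the names stat.level.1, period, statistic and
# time.series come from the documentation text.
temporal.aggregation <- list(
  stat.level.1 = list(
    period      = c(DJF = "12,1,2", JJA = "6,7,8"),  # assumed month encoding
    statistic   = "mean",                            # assumed statistic name
    time.series = FALSE                              # assumed logical flag
  )
)

# Inspect the nested structure.
str(temporal.aggregation)
```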
subregions
Named list containing information for geographical
regions. You can specify the boundaries by passing
wrap.to
-tag
(currently defined only for shapefiles).
area.fraction
With gridded data, the pixels almost never line up exactly with the boundary of the subregion you specify. If the centroid of a data pixel lies within the subregion, this data point is included in the analysis; otherwise it is considered to lie outside the subregion and is set to NA. This is the WUX default behavior (area.fraction = FALSE). For very small subregions and/or very coarse data resolution, however, you may get very few data points or even none at all. If you want to include every data pixel that merely 'touches' your subregion, use area.fraction = TRUE. The pixel's centroid then does not have to lie inside the subregion for the pixel to be included. With area.fraction = TRUE, WUX computes a weighted spatial average of all these pixels. The weight is the ratio of the pixel area lying within the subregion to the entire pixel area. So if one quarter of a data pixel is within the subregion (but its centroid, for example, is not), the pixel value is included in the analysis and weighted by 0.25 when averaging spatially. Pixels lying completely within the subregion have weight 1. area.fraction is useful if you are dealing with very small subregions and/or coarse data resolution, resulting in just a few pixels.
spatial.weighting
When averaging data over its spatial component, the simple arithmetic mean can result in strongly biased areal estimates. The reason for this is the geographical projection of the data. The globe spans 360 degrees of longitude and 180 degrees of latitude. The real distance (km) between latitudes remains the same over the entire globe, whereas the distance between longitudes depends on the latitude considered. One degree of longitude near the equator represents much more distance (km) than one degree of longitude in Norway, as the meridians converge at the poles. This has to be considered especially when dealing with global data (e.g. GCMs). GCM data is usually (within WUX so far always) stored on a rectangular lon-lat grid, so the polar regions appear disproportionately large in area. Common practice is cosine weighting of latitudes, resulting in smaller weights near the poles and the largest weights at the equator. See http://www.grassaf.org/general-documents/gsr/gsr_10.pdf for more details. spatial.weighting = TRUE enables cosine weighting of latitudes, whereas omitting the tag or setting it to FALSE results in an unweighted arithmetic areal mean (default). This option is valid only for data on a regular grid.
na.rm
Time slices of the NetCDF data may be missing without the user being aware of it. Reasons for such gaps might be short time series (e.g. some models project only until 2035, so an analysis until 2050 would be biased) or simply missing values due to corrupt or missing NetCDF files. If na.rm = TRUE is set in the user input, missing values are filled with NA, but the temporal statistics are calculated using the na.rm = TRUE flag. na.rm = FALSE keeps the NA values and thus leads to NA statistics.
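The difference between the two settings can be illustrated with base R's mean(). This is only a sketch of the na.rm semantics on a made-up series, not the WUX implementation.

```r
# Toy series with one missing time slice.
x <- c(2.1, NA, 3.4, 2.8)

mean_keep_na <- mean(x, na.rm = FALSE)  # the NA propagates into the statistic
mean_drop_na <- mean(x, na.rm = TRUE)   # mean of the three available values

print(mean_keep_na)
print(mean_drop_na)
```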
plot.subregions
A list containing information about diagnostic plotting of grid points within the subregions. PNG plots are generated showing the grid points within a subregion. The size of the drawn circles corresponds to the weighting factor of area.fraction. The list contains three elements: save.subregion.plots, xlim, and ylim.
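The weighting factors that set the circle sizes are the coverage ratios described under area.fraction. A minimal base R sketch of such an area-weighted spatial mean follows. The pixel values and coverage fractions are made up, and the code only illustrates the arithmetic, not the WUX internals.

```r
# Hypothetical values of three pixels touching a subregion.
pixel_value   <- c(10, 12, 20)
# Fraction of each pixel's area inside the subregion:
# two pixels fully inside, one only a quarter inside.
area_fraction <- c(1, 1, 0.25)

# Area-weighted mean versus the plain mean of all touching pixels.
weighted_avg   <- weighted.mean(pixel_value, w = area_fraction)
unweighted_avg <- mean(pixel_value)

print(weighted_avg)
print(unweighted_avg)
```

The partially covered pixel contributes only a quarter of its weight, so it pulls the areal mean much less than in the unweighted case.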
save.as.data
A character containing both the output path and filename. For example, save.as.data = "/tmp/cmip3" will save files in the directory /tmp/ as cmip3.csv (data frame containing model climatologies), cmip3_diff.csv (data frame containing the differences of the climatologies, i.e. the climate change signals) and cmip3.Rdata (an R binary file which can be loaded into the next R session and contains the data frames wux.data and wux.data.diff, analogous to the csv files).
climate.models
A character vector containing the names of the models to be processed. The names must be identical to the unique acronyms in the modelinput list. Read the next section if you want to add a model to the modelinput file.
modelinput
list (which should be stored in a file). You don't need to write tedious input routines; WUX does that for you. The modelinput list is a named list of climate models and contains meta-information on all currently known climate models. Sometimes models declare wrong attributes in their NetCDF files that are needed by modelinput. Therefore: KNOW THE MODEL YOU WANT TO ADD AND TAKE CARE OF THE META-INFORMATION YOU ARE INDICATING IN modelinput. Each tag consists of a named list with the following mandatory tags (i.e. names):
institute
Character indicating the institute which is developing the model.
rcm
Character name indicating the RCM acronym; if you are processing a GCM, type "".
gcm
Character name indicating the GCM acronym.
emission
Type of emission scenario used for the simulation.
gridfile.filename
Name of the NetCDF grid file containing the lon/lat variables.
gridfile.path
Directory of the NetCDF grid file.
file.path.default
Default directory of the NetCDF data files. If the files are stored in more than one directory, use the file.path.alt tag (see below).
file.path.alt
If your files are stored in more than one directory, you can enter a named vector of paths here. If files are split by parameter, pass the parameter name (CF Metadata convention) as the vector name. If they are split by period, pass historical and scenario as vector names. If files are separated by both period and parameter, you can use nested named lists instead of vectors.
file.name
Character vector of file names of the NetCDF data
files. If there are different file names for parameters (which
will be mostly the case) and/or file names in scenario- and
historical period are of different nature as well, use named
or nested lists as in the file.path.alt
tag.
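A sketch of how such named vectors and nested lists could be written is given below. All paths and file names are hypothetical. The nesting order (parameter as the outer level, period as the inner level) is an assumption, since the text above does not fix it; compare with modelinput_test before relying on it.

```r
# Files split by parameter only: vector names are CF parameter names.
# All paths are hypothetical.
file.path.alt <- c(air_temperature      = "/data/model_x/tas",
                   precipitation_amount = "/data/model_x/pr")

# Files split by both parameter and period: nested named lists.
# Assumption: parameter is the outer level, period the inner level.
file.name <- list(
  air_temperature = list(historical = "tas_hist.nc",
                         scenario   = "tas_scen.nc"),
  precipitation_amount = list(historical = "pr_hist.nc",
                              scenario   = "pr_scen.nc")
)

# Look up one file name by parameter and period.
print(file.name$air_temperature$scenario)
```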
You can set this tag to NA if this climate model has no files. This makes sense, for example, for the GKSS model for global radiation, as this ENSEMBLES model does not provide this parameter. Values for this model will be NA in the WUX data frame. These tags are optional:
resolution
Grid resolution character.
gcm.run
GCM run. Default is blank "".
what.timesteps
Default is daily time steps; type "monthly" for monthly data.
calendar
Define the NetCDF time:calendar attribute by hand. This is necessary if the NetCDF file contains wrong information. You can pass 360_days, no_leap or julian.
time.units
Define the NetCDF time:units attribute by hand, e.g. days since 1950-01-06 00:00:00.
count.first.time.value
The time variable in NetCDF files is a vector of time steps relative to the "time:units" attribute, with the calendar given by the "time:calendar" attribute. However, there are cases where certain climate models deal with two calendar types at once! Yes, that's possible... For example: data claim to have a "360 days" calendar. The "time:units" attribute is set to days since 1961-01-01 00:00:00 and the time vector looks like 365, 366, ..., 723, 724. The 365th day since 1961-01-01 is definitely not 1 January 1962 in the 360-day calendar, but it is in terms of "julian" dates. In such a case we would set count.first.time.value = "julian" while calendar remains 360 days. Other possibilities are count.first.time.value = "noleap" (or = "360days"). Currently this property is defined for calendar = "360 days" only, but can easily be extended to other calendars as well.
parameters
A named vector indicating which parameter long name and short name belong together, e.g. parameters = c(air_temperature = "tas_dm", precipitation_amount = "pr_24hc"). This is important if the NetCDF internal variable name deviates from the WUX default parameter shortname:
tas | for air_temperature
pr | for precipitation_amount
hurs | for relative_humidity
rsds | for global_radiation
wss | for wind_speed
ua | for eastward_wind
va | for northward_wind
psl | for air_pressure_at_sea_level
hus | for specific_humidity
hfss | for surface_upward_sensible_heat_flux
tasmin | for air_temperature_minimum
tasmax | for air_temperature_maximum
ts | for surface_temperature
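The relation between the default short names and a model-specific parameters vector can be sketched as a simple lookup. The helper netcdf_varname below is hypothetical and exists only for this illustration; it is not part of WUX.

```r
# Default short names, copied from the table above (subset).
default_shortnames <- c(air_temperature      = "tas",
                        precipitation_amount = "pr",
                        relative_humidity    = "hurs")

# Model-specific overrides, as in the example parameters tag.
parameters <- c(air_temperature      = "tas_dm",
                precipitation_amount = "pr_24hc")

# Hypothetical helper: use the model-specific name if one is given,
# otherwise fall back to the default short name.
netcdf_varname <- function(longname) {
  if (longname %in% names(parameters)) {
    parameters[[longname]]
  } else {
    default_shortnames[[longname]]
  }
}

print(netcdf_varname("air_temperature"))
print(netcdf_varname("relative_humidity"))
```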
models2wux needs two config files, userinput and modelinput, both being named list objects or files containing a named list. modelinput stores general information about your climate data, i.e. the locations of the NetCDF files and their filenames. It also stores certain meta-information for the specific climate simulations (e.g. a unique acronym for the simulation; the developing institution; the radiative forcing). Usually the modelinput information should be stored in a single file on your system and should be updated when new climate simulations come in. It is advisable to share this file with your colleagues if you work with the same NetCDF files on a shared IT infrastructure.
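A minimal sketch of a modelinput list with a single entry is shown below, using only the mandatory tags named in the "Configfile modelinput" section. The acronym, institute, paths and file names are made up for illustration; see modelinput_test for a real example.

```r
# Hypothetical entry: every value below is invented for illustration.
modelinput <- list(
  "MyGCM-r1" = list(
    institute         = "Example Institute",
    rcm               = "",                 # empty string because this is a GCM
    gcm               = "MyGCM",
    emission          = "rcp85",
    gridfile.filename = "grid.nc",
    gridfile.path     = "/data/MyGCM/grid",
    file.path.default = "/data/MyGCM",
    file.name         = c(air_temperature = "tas_MyGCM_rcp85.nc")
  )
)

# The list names are the unique acronyms referred to by climate.models.
print(names(modelinput))
```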
userinput contains information on what you actually want models2wux to do for you: mainly, which climate simulations defined in modelinput should be processed and what kind of statistics should be calculated. You also define the geographical regions of interest and the time horizon you want to consider. Here is an overview of all possible tags of a userinput list:
parameter.names | Specification of parameters to process.
reference.period | Specification of the reference period.
scenario.period | Specification of the scenario period.
temporal.aggregation | Specification of the temporal aggregation of the climate models (e.g. monthly mean or season sum) and indicating if either time series or climate change signals should be created.
subregions | Specification of subregions.
area.fraction | Take parts of model-pixels according to subregion coverage.
spatial.weighting | Cosine areal weighting of regular grid.
na.rm | Behavior for missing values of timeslices.
plot.subregions | Specifies diagnostic plotting of grid points within the subregions.
save.as.data | Specification of output directory and filename.
climate.models | Specification of climate models to be processed.
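A partial userinput list built from these tags might look like the following sketch. The values for the periods, parameter names and output path are the examples given in this help page; the model acronyms are placeholders that would have to match your modelinput. The tags temporal.aggregation, subregions and plot.subregions are left out here because their value formats are not fully specified in this section.

```r
# Partial sketch of a userinput list; not a complete, runnable WUX config.
userinput <- list(
  parameter.names   = c("air_temperature", "precipitation_amount"),
  reference.period  = "1961-1990",
  scenario.period   = "2021-2050",
  area.fraction     = FALSE,          # default: centroid-based pixel selection
  spatial.weighting = TRUE,           # cosine weighting of latitudes
  na.rm             = FALSE,          # keep NA values in the statistics
  save.as.data      = "/tmp/cmip3",
  climate.models    = c("model_a", "model_b")  # placeholder acronyms
)

print(names(userinput))
```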
This is what models2wux is doing: First, models2wux extracts the attributes set in the userinput list and loads the corresponding model information (storage paths, filenames, ...) from the modelinput list. It then retrieves the geographical boundaries of the regions specified in subregions (here the model gridfiles are introduced) and reads the specified parameter data from the NetCDF files within the boundaries of the current subregion. Subsequently, models2wux aggregates over the time dimension by the indicated months for the specified periods and calculates either the climatological mean values of the reference and future periods and the corresponding climate change signals, or time series. Next, models2wux aggregates over the spatial dimension. models2wux repeats these processing steps for each model specified in climate.models, each parameter in parameter.names, each subregion in subregions, and each period in reference.period and scenario.period, respectively. Finally, the processed data is written to a data.frame and stored on the hard disk as indicated by save.as.data.
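The order of the aggregation steps described above (first over time, then over space) can be sketched on toy data in base R. The array layout, the synthetic values and the cosine-of-latitude weights are assumptions made for this illustration; the sketch does not read NetCDF files and is not the WUX implementation.

```r
# Toy data: 2 latitudes x 3 longitudes x 4 time steps per period.
set.seed(1)
lat <- c(0, 60)                                        # degrees north
ref <- array(rnorm(24, mean = 10), dim = c(2, 3, 4))   # reference period
scn <- ref + 2                                         # scenario: 2 units warmer

# 1) Temporal aggregation: climatological mean per grid cell.
ref_clim <- apply(ref, c(1, 2), mean)
scn_clim <- apply(scn, c(1, 2), mean)

# 2) Climate change signal per grid cell.
delta <- scn_clim - ref_clim

# 3) Spatial aggregation with cosine-of-latitude weights
#    (each row of the matrix holds the weight of one latitude).
w <- matrix(cos(lat * pi / 180), nrow = 2, ncol = 3)
signal <- weighted.mean(delta, w)

print(signal)
```

Because the synthetic scenario is uniformly 2 units warmer, the weighted and unweighted spatial means coincide here; with spatially varying signals, the cells at latitude 60 would count half as much as those at the equator.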
For more detailed information on modelinput and userinput, see the corresponding sections Configfile "modelinput" and Configfile "userinput" in this help page.
modelinput_test, userinput_CMIP5_changesignal, cmip5_2050, cmip5_2100, ensembles, ensembles_gcms
## This example shows a typical workflow for models2wux, the workhorse of
## the wux package. Going through this example step-by-step, you will
## retrieve NetCDF files of two CMIP5 simulations and aggregate them to
## an R data.frame for further analysis.
## I) Load wux functions and example datasets...
library("wux")
## II) You need to obtain the climate simulations first. You can get
## started with downloading some example CMIP5 NetCDF files from the
## ESGF visiting for example http://pcmdi9.llnl.gov or using the
## CMIP5fromESGF function. Here, we download two simulations "NorESM1-M" and
## "CanESM2" into your home directory "~/tmp/CMIP5/" which will be
## created automatically. You will need a valid account at any ESGF
## node for this function to run. See ?CMIP5fromESGF for further help.
## Not run: 
# CMIP5fromESGF(save.to = "~/tmp/CMIP5/",
#               models = c("NorESM1-M", "CanESM2"),
#               variables = c("tas"),
#               experiments = c("historical", "rcp85"))
## End(Not run)
## III) Specify those downloaded data for models2wux. models2wux needs
## to know where the data is stored on your HDD and needs to have access
## to certain metadata of the climate simulator, which you have to
## provide as well. This information is stored in a list, which should
## be saved as ONE file somewhere on your computer. We call this
## information "modelinput". You should share this
## file with your colleagues using the same IT infrastructure to benefit
## from synergies. You can create such a file based on the data downloaded
## by "CMIP5fromESGF":
## Not run: 
# CMIP5toModelinput(filedir = "~/tmp/CMIP5",
#                   save.to = "~/modelinput.R")
## End(Not run)
## This file would then look like this:
data(modelinput_test)
## It specifies temperature and precipitation files for the two
## simulations "NorESM1-M" and "CanESM2" (RCP8.5), stored in
## "~/tmp/CMIP5/".
str(modelinput_test)
## IV) Next, you need to specify which simulations you want to read in
## with models2wux, what kind of statistics to calculate, what subregion
## to analyze, what time periods and seasons to define, and so on. This
## is done with a user input file, which contains a list with all the
## necessary information. You typically use different userinput files
## for different analyses, whereas your modelinput should remain in ONE
## file which will be updated each time you obtain a new climate
## simulation. One example user input file, which reads in both
## simulations specified above for the Alpine domain and returns their
## projected climate change signal, could look as follows:
data(userinput_CMIP5_changesignal)
str(userinput_CMIP5_changesignal)
## Alternatively, the following userinput returns a time series of both
## models; it differs only in the "time.series" tag and in the
## specified periods:
data(userinput_CMIP5_timeseries)
str(userinput_CMIP5_timeseries)
## V) Finally, you can run models2wux to obtain a data.frame of the
## climate change features specified above:
## Not run: 
# climchange.df <- models2wux(userinput = userinput_CMIP5_changesignal,
#                             modelinput = modelinput_test)
## End(Not run)
## A better practice is to save both input files, each containing a named
## list, somewhere on your disk and pass the files directly to the
## models2wux function. If you had stored the two files in your home
## directory as e.g. "~/userinput.R" and "~/modelinput.R" you can call:
## Not run: 
# climchange.df <- models2wux(userinput = "~/userinput.R",
#                             modelinput = "~/modelinput.R")
## End(Not run)
## If you downloaded the data correctly, you should obtain a data.frame:
## Not run: 
# climchange.df
## End(Not run)
## which should be identical to this example data.frame:
data(CMIP5_example_changesignal)
CMIP5_example_changesignal
## Instead of calculating the climate change signals, you can also
## generate time series of the two models aggregated over the Alpine
## domain, using a different user input file:
## Not run: 
# climchange.df <- models2wux(userinput = userinput_CMIP5_timeseries,
#                             modelinput = modelinput_test)
## End(Not run)
## VI) Finally, you can carry out any kind of analysis you are interested in,
## using either functions from wux or any other R functionality
summary(CMIP5_example_changesignal, parms = "delta.air_temperature")
## or plot timeseries as
require(lattice)
data(CMIP5_example_timeseries)
## Not run: 
# xyplot(air_temperature ~ year | season,
#        groups = acronym,
#        data = CMIP5_example_timeseries,
#        type = c("l", "g"),
#        main = "NorESM1-M and CanESM2 simulations over Alpine Region\nRCP 8.5 forcing")
## End(Not run)