IsoriX (version 0.6)

queryGNIP: Filter the dataset to create an isoscape

Description

This function prepares the available GNIP data (e.g. GNIPDataDE) for creating an isoscape. It can trim the data by month, year and location, and aggregate the selected data per location, per location:month combination or per location:year combination. It can also be used to randomly exclude some observations.

Usage

queryGNIP(data, month = 1:12, year, long.min, long.max, lat.min, lat.max,
  split.by = NULL, prop.random = 0, random.level = "station")

Arguments

data

A dataframe containing original isotopic measurements similar in structure to GNIPDataDE

month

A numeric vector indicating the months to select from. Should be a vector of whole numbers between 1 and 12. The default, 1:12, selects all months.

year

A numeric vector indicating the years to select from. Should be a vector of whole numbers. The default is to select all years available.

long.min

A numeric indicating the minimum longitude to select from. Should be a number between -180 and 180. If not provided, -180 will be considered.

long.max

A numeric indicating the maximum longitude to select from. Should be a number between -180 and 180. If not provided, 180 will be considered.

lat.min

A numeric indicating the minimum latitude to select from. Should be a number between -90 and 90. If not provided, -90 will be considered.

lat.max

A numeric indicating the maximum latitude to select from. Should be a number between -90 and 90. If not provided, 90 will be considered.

split.by

A string indicating whether data should be aggregated per location (split.by = NULL, the default), per location:month combination (split.by = "month"), or per location:year combination (split.by = "year").

prop.random

A numeric indicating the proportion of observations or weather stations (depending on the argument for random.level) that will be kept. If prop.random is greater than 0, then the function will return a list containing two dataframes: one containing the selected data, called selected.data, and one containing the remaining data, called remaining.data.

random.level

A string indicating the level at which random draws are performed. The two possibilities are "obs", which indicates that observations are randomly drawn independently of their location, or "station" (default), which indicates that observations are randomly drawn at the level of weather stations.
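The difference between the two values of random.level can be illustrated with a minimal base R sketch. It is not the IsoriX implementation: the helper split_random and the toy columns station and value are hypothetical, and the aggregation step that queryGNIP performs is left out. Only the names of the two list elements, selected.data and remaining.data, come from the documentation above.

```r
set.seed(1)

# Toy data: 4 hypothetical stations with 3 observations each
toy <- data.frame(
  station = rep(c("A", "B", "C", "D"), each = 3),
  value   = rnorm(12)
)

# Hypothetical helper (not part of IsoriX): keep a proportion `prop`
# of either single observations or whole stations
split_random <- function(d, prop, level = c("station", "obs")) {
  level <- match.arg(level)
  if (level == "obs") {
    # draw rows regardless of the station they belong to
    keep <- seq_len(nrow(d)) %in% sample(nrow(d), round(prop * nrow(d)))
  } else {
    # draw stations, then keep every row of the drawn stations
    stations <- unique(d$station)
    chosen <- sample(stations, round(prop * length(stations)))
    keep <- d$station %in% chosen
  }
  list(selected.data  = d[keep, , drop = FALSE],
       remaining.data = d[!keep, , drop = FALSE])
}

by_station <- split_random(toy, prop = 0.5, level = "station")
by_obs     <- split_random(toy, prop = 0.75, level = "obs")
```

With level = "station" a given station ends up entirely in one of the two dataframes, whereas with level = "obs" the rows of one station may be divided between them.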

Value

This function returns a dataframe containing the filtered data aggregated by weather station, or a list of two dataframes if prop.random is greater than 0 (see the argument prop.random above). For each weather station, the sample estimates of the mean and variance are computed.

Details

This function aggregates the data as required for the IsoriX workflow. Three aggregation schemes are possible. The simplest one, used as default, aggregates the data so as to obtain a single row per weather station. Datasets prepared in this way can be readily fitted with the function isofit to build an isoscape. It is also possible to aggregate the data differently in order to build sub-isoscapes representing temporal variation in isotope composition, or in order to produce isoscapes weighted by the amount of precipitation. The two other options are to split the data from each weather station either by month or by year. This is set with the split.by argument of the function. Datasets prepared in this way should be fitted with the function isomultifit.
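The aggregation schemes described above can be sketched in base R as follows. This is only an illustration under assumptions: the helper aggregate_values, the toy data and the column names (station, month, value, mean.value, var.value) are hypothetical and do not necessarily match the structure of GNIPDataDE or the output of queryGNIP.

```r
# Toy data: 2 hypothetical stations, 2 months, 2 measurements each
toy <- data.frame(
  station = c("A", "A", "A", "A", "B", "B", "B", "B"),
  month   = c(1, 1, 2, 2, 1, 1, 2, 2),
  value   = c(-60, -62, -50, -54, -70, -74, -66, -68)
)

# Hypothetical helper: sample mean and variance of `value`
# for each combination of the grouping columns in `by`
aggregate_values <- function(d, by) {
  means <- aggregate(d["value"], by = d[by], FUN = mean)
  vars  <- aggregate(d["value"], by = d[by], FUN = var)
  names(means)[names(means) == "value"] <- "mean.value"
  names(vars)[names(vars) == "value"]   <- "var.value"
  merge(means, vars, by = by)
}

# One row per station (analogous to split.by = NULL)
per_station <- aggregate_values(toy, "station")

# One row per station:month combination (analogous to split.by = "month")
per_station_month <- aggregate_values(toy, c("station", "month"))
```

Replacing "month" with a year column in the second call would give the station:year scheme in the same way.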

The function also allows the user to filter the weather station data (e.g. GNIPDataDE) based on time (years and/or months) and space (locations given in geographic coordinates, i.e. longitude and latitude). This makes it possible to calculate tailored isoscapes that match, for example, the time of sampling, and to speed up the model fit by cropping the data to a certain area. The dataframe produced by this function can be used as input to fit the isoscape (see isofit and isomultifit).
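The kind of filtering described above amounts to combining a few logical conditions, as in the following base R sketch. The toy data and its column names are hypothetical, and treating the longitude and latitude bounds as inclusive is an assumption; the documentation does not state how queryGNIP handles observations lying exactly on a bound.

```r
# Toy data: 3 hypothetical stations with 2 observations each
toy <- data.frame(
  station = c("A", "A", "B", "B", "C", "C"),
  long    = c(8.5, 8.5, 13.4, 13.4, 2.3, 2.3),
  lat     = c(50.1, 50.1, 52.5, 52.5, 48.9, 48.9),
  year    = c(1995, 1997, 1995, 1996, 1995, 1996),
  month   = c(6, 6, 1, 7, 6, 7),
  value   = c(-45, -47, -80, -50, -40, -42)
)

# Keep May to August of 1995 and 1996 within a bounding box
# (inclusive bounds are an assumption of this sketch)
keep <- toy$month %in% 5:8 &
  toy$year %in% 1995:1996 &
  toy$long >= 5 & toy$long <= 16 &
  toy$lat >= 47 & toy$lat <= 55

filtered <- toy[keep, ]
```

Here only one observation from station A and one from station B remain; station C lies west of the box and is dropped entirely.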

See Also

IsoriX for the complete workflow

GNIPDataDE for the complete dataset

Examples

# NOT RUN {
## Create a processed dataset for Germany
GNIPDataDEagg <- queryGNIP(data = GNIPDataDE)

head(GNIPDataDEagg)

## Create a processed dataset for Germany per month
GNIPDataDEmonthly <- queryGNIP(data = GNIPDataDE,
                               split.by = "month")

head(GNIPDataDEmonthly)

## Create a processed dataset for Germany per year
GNIPDataDEyearly <- queryGNIP(data = GNIPDataDE,
                              split.by = "year")

head(GNIPDataDEyearly)

## Create an isoscape dataset for the warm months in Germany in 1995 and 1996
GNIPDataDEwarm <- queryGNIP(data = GNIPDataDE,
                            month = 5:8,
                            year = 1995:1996)

head(GNIPDataDEwarm)


## Create a dataset with 90% of the observations
GNIPDataDE90pct <- queryGNIP(data = GNIPDataDE,
                             prop.random = 0.9,
                             random.level = "obs")

lapply(GNIPDataDE90pct, head) # show beginning of both datasets

## Create a dataset with half the weather stations
GNIPDataDE50pctStations <- queryGNIP(data = GNIPDataDE,
                                     prop.random = 0.5,
                                     random.level = "station")

lapply(GNIPDataDE50pctStations, head)


## Create a dataset with half the weather stations split per month
GNIPDataDE50pctStationsMonthly <- queryGNIP(data = GNIPDataDE,
                                            split.by = "month",
                                            prop.random = 0.5,
                                            random.level = "station")

lapply(GNIPDataDE50pctStationsMonthly, head)

# }
