traitExtractFeatures: Extract features, that are single-valued for each individual, from traits observed over time.

Description

Extract one or more sets of features from traits observed over time, the result being traits that have a single value for each individual. The sets of features are:

single times -- the value for each individual for a single time. (uses getTimesSubset)
growth rates for a time interval -- the average growth rate (AGR and/or RGR) over a time interval for each individual. (uses byIndv4Intvl_GRsDiff or byIndv4Intvl_GRsAvg)
water use traits for a time interval -- the total water use (WU), the water use rate (WUR) and the water use index (WUI) over a time interval for each individual. (uses byIndv4Intvl_WaterUse so see its documentation for further details)
growth rates for the imaging period overall -- the average growth rate (AGR and/or RGR) over the whole imaging period for each individual. (uses byIndv4Intvl_GRsDiff or byIndv4Intvl_GRsAvg)
water use traits for the imaging period overall -- the total water use (WU), the water use rate (WUR) and the water use index (WUI) for the whole imaging period for each individual. (uses byIndv4Intvl_WaterUse)
totals for the imaging period overall -- the total over the whole imaging period of a trait for each individual. (uses byIndv4Intvl_ValueCalc)
maximum for the imaging period overall -- the maximum value over the whole imaging period, and the time at which it occurred, for each individual. (uses byIndv4Intvl_ValueCalc)
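
As a rough illustration of what "single-valued for each individual" means, the following base-R sketch reduces a small long-format data set to one row per individual. It does not call traitExtractFeatures; the data values and the helper code are made up for illustration, and the column names merely imitate the naming described in the Value section.

```r
# Toy long-format data: two plants imaged at DAP 18, 22 and 27.
# All values are invented for illustration.
dat <- data.frame(
  Snapshot.ID.Tag = rep(c("p1", "p2"), each = 3),
  DAP             = rep(c(18, 22, 27), times = 2),
  PSA             = c(10, 18, 33, 12, 20, 30),
  WU              = c(30, 35, 40, 28, 33, 41)
)

# Split the rows by individual and reduce each subset to single values
features <- do.call(rbind, lapply(split(dat, dat$Snapshot.ID.Tag), function(d) {
  d     <- d[order(d$DAP), ]
  first <- d[d$DAP == 18, ]
  last  <- d[d$DAP == 27, ]
  data.frame(
    Snapshot.ID.Tag = d$Snapshot.ID.Tag[1],
    PSA.18          = first$PSA,                          # single time
    PSA.AGR.18to27  = (last$PSA - first$PSA) / (27 - 18), # average AGR for the interval
    WU.total        = sum(d$WU),                          # total over all rows
    PSA.max         = max(d$PSA),                         # maximum value
    PSA.max.DAP     = d$DAP[which.max(d$PSA)]             # time at which the maximum occurred
  )
}))
rownames(features) <- NULL
print(features)
```

Each row of features corresponds to one individual, which is the shape of the data.frame that traitExtractFeatures returns.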

The Tomato vignette illustrates the use of traitSmooth and traitExtractFeatures to carry out the SET procedure for the example presented in Brien et al. (2020). Use vignette("Tomato", package = "growthPheno") to access it.

Usage

traitExtractFeatures(data, individuals = "Snapshot.ID.Tag", times = "DAP", 
                     starts.intvl = NULL, stops.intvl = NULL, 
                     suffices.intvl = NULL, 
                     responses4intvl.rates = NULL, 
                     growth.rates = NULL, 
                     growth.rates.method = "differences", 
                     suffices.growth.rates = NULL, 
                     water.use4intvl.traits = NULL, 
                     responses4water = NULL, 
                     water.trait.types = c("WU", "WUR", "WUI"), 
                     suffix.water.rate = "R", suffix.water.index = "I", 
                     responses4singletimes = NULL, times.single = NULL, 
                     responses4overall.rates = NULL, 
                     water.use4overall.water = NULL, 
                     responses4overall.water = NULL, 
                     responses4overall.totals = NULL, 
                     responses4overall.max = NULL, 
                     intvl.overall = NULL, suffix.overall = NULL, 
                     sep.times.intvl = "to", sep.suffix.times = ".", 
                     sep.growth.rates = ".", sep.water.traits = "", 
                     mergedata = NULL, ...)

Value

A data.frame that contains an individuals column and a column for each extracted trait, in addition to any columns in mergedata. The number of rows in the data.frame will equal the number of unique elements of the individuals column in data, except when there are extra values in the individuals column in mergedata. If the latter applies, then the number of rows will equal the number of unique values in the combined individuals columns from mergedata and data.

The names of the columns produced by the function are constructed as follows:

single times -- A name for a single-time trait is formed by appending a full stop to an element of responses4singletimes, followed by the value of times at which the values were observed.
growth rates for a time interval -- The name for an interval growth rate is constructed by concatenating the relevant element of responses4intvl.rates, growth.rates and a suffix for the time interval, each separated by a full stop. The interval suffix is formed by joining its starts.intvl and stops.intvl values, separating them by the value of sep.times.intvl.
growth rates for the whole imaging period -- The name for an overall growth rate is constructed by concatenating the relevant element of responses4overall.rates, growth.rates and suffix.overall, each separated by a full stop.
water use traits for a time interval -- Construction of the names for the three water traits begins with the value of water.use4intvl.traits. The rate (WUR) has either R or the value of suffix.water.rate added to the value of water.use4intvl.traits. Similarly the index (WUI) has either I or the value of suffix.water.index added to it. The WUI also has the element of responses4water used in calculating the WUI prefixed to its name. All three water use traits have a suffix for the interval appended to their names. This suffix is constructed by joining its starts.intvl and stops.intvl, separated by the value of sep.times.intvl.
water use traits for the whole imaging period -- Construction of the names for the three water traits begins with the value of water.use4intvl.traits. The rate (WUR) has either R or the value of suffix.water.rate added to the value of water.use4intvl.traits. Similarly the index (WUI) has either I or the value of suffix.water.index added to it. The WUI also has the element of responses4water used in calculating the WUI prefixed to its name. All three water use traits have suffix.overall appended to their names.
the total for the whole of the imaging period -- The name for a whole-of-imaging total is formed by combining an element of responses4overall.totals with suffix.overall, separating them by a full stop.
maximum for the whole of imaging period -- The name of the column with the maximum values will be the result of concatenating the responses4overall.max, "max" and suffix.overall, each separated by a full stop. The name of the column with the value of times at which the maximum occurred will be the result of concatenating the responses4overall.max, "max" and the value of times, each separated by a full stop.

The data.frame is returned invisibly.
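
The naming rules above can be sketched with paste0 in base R. This is only one reading of the description, using the default separators from the Usage section and the response names from the example; the resulting strings are what the sketch produces and have not been checked against actual package output.

```r
# Interval end points and the default separators from the Usage section
starts.intvl     <- c(18, 22)
stops.intvl      <- c(22, 27)
sep.times.intvl  <- "to"
sep.suffix.times <- "."
sep.growth.rates <- "."
sep.water.traits <- ""

# Interval suffix: start and stop joined by sep.times.intvl
intvl.suffix <- paste0(starts.intvl, sep.times.intvl, stops.intvl)

# Single-time traits: response, separator, time
single.names <- paste0("sPSA", sep.suffix.times, c(18, 22, 27))

# Interval growth rates: response, growth rate, interval suffix
gr <- expand.grid(suffix = intvl.suffix, rate = c("AGR", "RGR"),
                  stringsAsFactors = FALSE)
rate.names <- paste0("sPSA", sep.growth.rates, gr$rate,
                     sep.suffix.times, gr$suffix)

# Water use traits: WU, WUR (suffix "R") and WUI (suffix "I", response prefixed)
wu.names  <- paste0("sWU", sep.suffix.times, intvl.suffix)
wur.names <- paste0("sWU", sep.water.traits, "R", sep.suffix.times, intvl.suffix)
wui.names <- paste0("sPSA", sep.growth.rates, "sWU", sep.water.traits, "I",
                    sep.suffix.times, intvl.suffix)

print(c(single.names, rate.names, wu.names, wur.names, wui.names))
```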

Arguments

data

A data.frame containing the columns specified by individuals, times, the various responses arguments and the water use arguments.

individuals

A character giving the name of the factor that defines the subsets of the data for which each subset corresponds to the response values for an individual (e.g. plant, pot, cart, plot or unit).

times

A character giving the name of the column in data containing the times at which the data was collected, either as a numeric, factor, or character. It will be used in identifying the intervals and, if a factor or character, the values should be numerics stored as characters.

starts.intvl

A numeric giving the times, in terms of values in times, that are the initial times for a set of intervals for which growth.rates and water.use traits are to be obtained. These times may also be used to obtain values for single-time traits (see responses4singletimes).

stops.intvl

A numeric giving the times, in terms of values in times, that are the end times for a set of intervals for which growth.rates and water.use traits are to be obtained. These times may also be used to obtain values for single-time traits (see responses4singletimes).

suffices.intvl

A character giving the suffices for intervals specified using starts.intvl and stops.intvl. If NULL, the suffices are automatically generated using starts.intvl, stops.intvl and sep.times.intvl.

responses4intvl.rates

A character specifying the names of the columns containing responses for which growth rates are to be obtained for the intervals specified by starts.intvl and stops.intvl. For growth.rates.method set to differences, the growth rates will be computed from the column of the response values whose name is listed in responses4intvl.rates. For growth.rates.method set to ratesaverages, the growth rates will be computed from a column with the growth rates computed for each time. The name of that column should be a response listed in responses4intvl.rates to which is appended an element of suffices.growth.rates.

growth.rates.method

A character specifying the method to use in calculating the growth rates over an interval for the responses specified by responses4intvl.rates. The two possibilities are "differences" and "ratesaverages". For differences, the growth rate for an interval is computed by taking differences between the values of a response for pairs of times. For ratesaverages, the growth rate for an interval is computed by taking weighted averages of growth rates for times within the interval. That is, differences operates on the response and ratesaverages operates on the growth rates previously calculated from the response, so the appropriate one of these must be in data. The ratesaverages option is most appropriate when the growth rates are calculated using the derivatives of a fitted curve. Note that, for responses for which the AGR has been calculated using differences, both methods will give the same result, but the differences option will be more efficient than ratesaverages.
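
The claim that both methods agree for an AGR calculated by differencing can be checked with a small base-R sketch. It assumes that the weights in the weighted average are the lengths of the time gaps preceding each time; the description above only says "weighted averages", so treat the weights as an assumption. The data values are made up.

```r
# Invented observations for one individual at unequally spaced times
DAP <- c(18, 20, 21, 24, 27)
PSA <- c(10, 14, 15, 24, 33)

# Per-time AGRs obtained by differencing (the first time has no rate)
dt     <- diff(DAP)
AGR.pt <- diff(PSA) / dt

# "differences": uses only the response at the two ends of the interval
n        <- length(DAP)
AGR.diff <- (PSA[n] - PSA[1]) / (DAP[n] - DAP[1])

# "ratesaverages": weighted average of the per-time rates,
# with the time gaps as weights (an assumption of this sketch)
AGR.avg <- weighted.mean(AGR.pt, w = dt)

print(all.equal(AGR.diff, AGR.avg))
```

The two results coincide here because the weighted sum of the differenced rates telescopes to the overall change in the response. With rates taken from the derivatives of a fitted curve the two would generally differ.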

growth.rates

A character specifying which growth rates are to be obtained for the intervals specified by starts.intvl and stops.intvl. It should contain one or both of "AGR" and "RGR".

suffices.growth.rates

A character giving the suffices appended to responses4intvl.rates in constructing the names of the columns for storing the growth rates specified by growth.rates. If suffices.growth.rates is NULL, then "AGR" and "RGR" will be used.

water.use4intvl.traits

A character giving the names of the columns in data that contain the water use values that are to be used in computing the water use traits (WU, WUR, WUI) for the intervals specified by starts.intvl and stops.intvl. If there is only one column name, then the WUI will be calculated using this name for all column names in responses4water. If there are several column names in water.use4intvl.traits, then there must be either one or the same number of names in responses4water. If both have the same number of names, then the two lists of column names will be processed in parallel, so that a single WUI will be produced for each pair of responses4water and water.use4intvl.traits values.

responses4water

A character giving the names of the columns in data that are to provide the numerator in calculating a WUI for the intervals specified using starts.intvl and stops.intvl. The denominator will be the values in the columns in data whose names are those given by water.use4intvl.traits. If there is only one column name in responses4water, then the WUI will be calculated using this name for all column names in water.use4intvl.traits. If there are several column names in responses4water, then there must be either one or the same number of names in water.use4intvl.traits. If both have the same number of names, then the two lists of column names will be processed in parallel, so that a single WUI will be produced for each pair of responses4water and water.use4intvl.traits values.

See the Value section for a description of how responses4water is incorporated into the names constructed for the water use traits.
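
The pairing rule for water.use4intvl.traits and responses4water can be sketched as follows. The helper pair_water_names is hypothetical and is not part of growthPheno; it only mimics the rule as described: a single name is reused for every name in the other argument, equal-length arguments are paired in parallel, and anything else is rejected.

```r
# Hypothetical helper (not a growthPheno function) that pairs water use
# columns with response columns according to the rule described above
pair_water_names <- function(water.use, responses) {
  n.w <- length(water.use)
  n.r <- length(responses)
  if (n.w != n.r && n.w != 1 && n.r != 1)
    stop("need one name, or the same number of names, in each argument")
  n <- max(n.w, n.r)
  data.frame(response  = rep(responses, length.out = n),
             water.use = rep(water.use, length.out = n),
             stringsAsFactors = FALSE)
}

# One water use column reused for two responses: two WUIs
print(pair_water_names("sWU", c("sPSA", "sHeight")))

# Equal numbers of names: processed in parallel, one WUI per pair
print(pair_water_names(c("sWU1", "sWU2"), c("sPSA", "sHeight")))
```

The column names sHeight, sWU1 and sWU2 are invented for the illustration.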

water.trait.types

A character listing the trait types to compute and return. It should be some combination of WU, WUR and WUI. See Details in byIndv4Intvl_WaterUse for how each is calculated.
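
A minimal sketch of the three trait types for one individual and one interval is given below. It assumes that WU is the sum of the water use recorded after the start time up to and including the stop time, that WUR is WU divided by the length of the interval, and that WUI is the change in the response over the interval divided by WU. These definitions are assumptions of the sketch; the Details section of byIndv4Intvl_WaterUse is the authority on how each trait is actually calculated. The data values are made up.

```r
# Invented data for one individual
DAP  <- c(18, 19, 20, 21, 22)
sWU  <- c(30, 35, 40, 42, 45)   # water used since the previous imaging
sPSA <- c(10, 12, 15, 19, 24)

start.time <- 18
end.time   <- 22

# Times after the start and up to the end of the interval (assumed convention)
in.intvl <- DAP > start.time & DAP <= end.time

WU  <- sum(sWU[in.intvl])                                   # total water use
WUR <- WU / (end.time - start.time)                         # water use rate
WUI <- (sPSA[DAP == end.time] - sPSA[DAP == start.time]) / WU  # water use index

print(c(WU = WU, WUR = WUR, WUI = WUI))
```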

suffix.water.rate

A character giving the label to be appended to the value of water.use4intvl.traits to form the name of the WUR.

suffix.water.index

A character giving the label to be appended to the value of water.use4intvl.traits to form the name of the WUI.

responses4singletimes

A character specifying the names of the columns containing responses for which a column of the values is to be formed for each response for each of the times values specified in times.single. If times.single is NULL, then the unique values in the combined starts.intvl and stops.intvl will be used.

times.single

A numeric giving the times of imaging, for each of which, the values of each responses4singletimes will be stored in a column of the resulting data.frame. If NULL, then the unique values in the combined starts.intvl and stops.intvl will be used.

responses4overall.rates

A character specifying the names of the columns containing responses for which growth rates are to be obtained for the whole imaging period i.e. the interval specified by intvl.overall. The settings of growth.rates.method, growth.rates, suffices.growth.rates, sep.growth.rates, suffix.overall and intvl.overall will be used in producing the growth rates. See responses4intvl.rates for more information about how these arguments are used.

water.use4overall.water

A logical indicating whether the overall water.traits are to be obtained. The settings of water.trait.types, suffix.water.rate, suffix.water.index, sep.water.traits, suffix.overall and intvl.overall will be used in producing the overall water traits. See water.use4intvl.traits for more information about how these arguments are used.

responses4overall.water

A character giving the names of the columns in data that are to provide the numerator in calculating a WUI for the interval corresponding to the whole imaging period. See responses4water for further details and for more information about how this argument is processed.

responses4overall.totals

A character specifying the names of the columns containing responses for which a column of the values is to be formed by summing the response for each individual over the whole of the imaging period.

responses4overall.max

A character specifying the names of the columns containing responses for which columns of the values are to be formed for the maximum of the response for each individual over the whole of the imaging period and the times value at which the maximum occurred.

intvl.overall

A numeric giving the start and stop times of imaging. If NULL, the start time will be the minimum of starts.intvl and the stop time will be the maximum of stops.intvl.
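
The defaults described for times.single and intvl.overall when they are NULL can be written out in base R as below. The sorting of the single times is an assumption of the sketch; the descriptions only say that the unique combined values are used.

```r
# Interval end points as they might be supplied by the user
starts.intvl <- c(18, 22, 27)
stops.intvl  <- c(22, 27, 33)

# Default for times.single: unique values of the combined end points
times.single <- sort(unique(c(starts.intvl, stops.intvl)))

# Default for intvl.overall: earliest start and latest stop
intvl.overall <- c(min(starts.intvl), max(stops.intvl))

print(times.single)
print(intvl.overall)
```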

suffix.overall

A character giving the suffix to be appended to the names of traits that apply to the whole imaging period. It applies to responses4overall.rates, water.use4overall.water, responses4overall.water and responses4overall.totals. If NULL, then nothing will be added.

sep.times.intvl

A character giving the separator to use in combining a starts.intvl with a stops.intvl in constructing the suffix to be appended to an interval trait. If set to NULL and there is only one value for each of starts.intvl and stops.intvl, then no suffix will be added; otherwise sep.times.intvl set to NULL will result in an error.

sep.suffix.times

A character giving the separator to use in appending a suffix for times to a trait. For no separator, set to "".

sep.growth.rates

A character giving the character(s) to be used to separate the suffices.growth.rates value from the responses4intvl.rates values in constructing the name for a new rate. It is also used for separating responses4water values from the suffix.water.index. For no separator, set to "".

sep.water.traits

A character giving the character(s) to be used to separate the suffix.water.rate and suffix.water.index values from the response value in constructing the name for a new rate/index. The default of "" results in no separator.

mergedata

A data.frame containing a column with the name given in individuals and for which there is only one row for each value given in this column. In general, it will be that the number of rows in mergedata is equal to the number of unique values in the column in data labelled by the value of individuals, but this is not mandatory. If mergedata is not NULL, the values extracted by traitExtractFeatures will be merged with it.
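
The effect of extra individuals in mergedata on the number of rows can be imitated with base R merge. This is a sketch under the assumption that all individuals from both data.frames are kept, which is consistent with the Value section; how traitExtractFeatures performs the merge internally is not stated here. The data values are made up.

```r
# mergedata with one row per individual, including an individual (p3)
# that has no observations in the extracted traits
mergedata <- data.frame(Snapshot.ID.Tag = c("p1", "p2", "p3"),
                        Zn              = c(0, 10, 40))

# Stand-in for the traits extracted from data (only p1 and p2 present)
extracted <- data.frame(Snapshot.ID.Tag = c("p1", "p2"),
                        PSA.18          = c(10, 12))

# Keep every individual found in either data.frame
merged <- merge(mergedata, extracted, by = "Snapshot.ID.Tag", all = TRUE)
print(merged)
```

In this sketch the result has three rows, one for each unique individual in the combined columns, with a missing value for the trait of the individual that occurs only in mergedata.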

...

allows passing of arguments to other functions; not used at present.

Author

Chris Brien

References

Brien, C., Jewell, N., Garnett, T., Watts-Williams, S. J., & Berger, B. (2020). Smoothing and extraction of traits in the growth analysis of noninvasive phenotypic data. Plant Methods, 16, 36. doi:10.1186/s13007-020-00577-6.

Examples


 #Load the package and the data
 library(growthPheno)
 data(tomato.dat)

 #Define DAP constants 
 DAP.endpts   <- c(18,22,27,33,39,43,51)
 nDAP.endpts <- length(DAP.endpts)
 DAP.starts <- DAP.endpts[-nDAP.endpts]
 DAP.stops   <- DAP.endpts[-1]
 DAP.segs <- list(c(DAP.endpts[1]-1, 39), 
                   c(40, DAP.endpts[nDAP.endpts]))
 #Add PSA rates and smooth PSA, also producing sPSA rates
 tom.dat <- byIndv4Times_SplinesGRs(data = tomato.dat, 
                                    response = "PSA", response.smoothed = "sPSA", 
                                    times = "DAP", rates.method = "differences", 
                                    smoothing.method = "log", 
                                    spline.type = "PS", lambda = 1, 
                                    smoothing.segments = DAP.segs)
  
 #Smooth WU
 tom.dat <- byIndv4Times_SplinesGRs(data = tom.dat, 
                                    response = "WU", response.smoothed = "sWU",
                                    rates.method = "none", 
                                    times = "DAP", 
                                    smoothing.method = "direct", 
                                    spline.type = "PS", lambda = 10^(-0.5), 
                                    smoothing.segments = DAP.segs)
 
 #Extract single-valued traits for each individual
 indv.cols <- c("Snapshot.ID.Tag", "Lane", "Position", "Block", "Cart", "AMF", "Zn")
 indv.dat <- subset(tom.dat, subset = DAP == DAP.endpts[1], 
                    select = indv.cols)
 indv.dat <- traitExtractFeatures(data = tom.dat, 
                                  starts.intvl = DAP.starts, stops.intvl = DAP.stops, 
                                  responses4singletimes = "sPSA", 
                                  responses4intvl.rates = "sPSA", 
                                  growth.rates = c("AGR", "RGR"), 
                                  water.use4intvl.traits = "sWU", 
                                  responses4water = "sPSA", 
                                  responses4overall.totals = "sWU",
                                  responses4overall.max = "sPSA.AGR",
                                  mergedata = indv.dat)
