
seas (version 0.2-1)

mkfact: Make date into a time factor

Description

Discretizes a date within a year into a bin (or factor) for analysis, such as 11-day groups or by month.

Usage

# normal usage
mkfact(dat, width)

# dat is an integer Julian day and width is non-numeric
mkfact(dat, width, year)

Arguments

dat
data.frame with at least a date column (Date or POSIXct class).

It can also be an integer specifying the Julian day (see year below).

width
One of many options; usually specifies the number of days in each bin (default is 11 days), but can also be "mon" for months; see details below.

year
Required if dat is omitted, or if dat is a Julian day integer and width is non-numeric; used to calculate leap year.

Value

  • Returns an array of factors for each date given in dat. See examples for its application.

Synopsis

mkfact(dat, width = 11, year)

Locale Warning

Month names generated using "mon" or "month" are locale-specific, and depend on your operating system and system language settings. Normally, abbreviated month names should have three characters or fewer, with no trailing period. However, Microsoft-based operating systems have an inconsistent set of abbreviated month names between locales. For example, abbreviated month names in English locales have three letters with no period at the end, while French locales have 3 to 4 letters with a period at the end. If your OS is POSIX-compliant, you should have consistent month names in any locale. To avoid any locale-related issues, simply revert to the C locale (e.g. Sys.setlocale(locale="C")).
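As a minimal sketch of the workaround above, the following base R snippet switches only the time category to the C locale, lists the abbreviated month names, and then restores the previous setting. It assumes that month labels are derived from the LC_TIME category (the usual case for base R date formatting); it does not call mkfact itself.

```r
# Remember the current time locale so it can be restored afterwards.
old_locale <- Sys.getlocale("LC_TIME")

# Switch only the date/time category to the C locale.
Sys.setlocale("LC_TIME", "C")

# Abbreviated month names for the first day of each month in 2005.
month_abbr <- format(as.Date(sprintf("2005-%02d-01", 1:12)), "%b")
print(month_abbr)

# Restore the previous setting.
Sys.setlocale("LC_TIME", old_locale)
```

Restricting the change to LC_TIME leaves number formatting, collation, and messages in the user's own locale.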

Details

This date function groups the days of a year into discrete bins (or into a factor). Statistical and plotting functions can then be applied to a variable contained within each bin. An example of this would be to find the monthly temperature averages, where month is the bin.

If width is an integer, the width of each bin (except for the last) will be exactly width days. Since the number of days in a year is not constant, and is not always evenly divisible by width, the number of days in the last bin will vary. mksub determines that the last bin must have at least 20% of the number of observations for a leap year; otherwise it is merged into the second-to-last bin (which will then have extra days).

If width is a non-integer numeric (e.g. 366/12), the width of each bin varies slightly. Using width = 366/12 is slightly different from width = "mon". Leap years only affect the last bin.

Other common classifications based on the Gregorian calendar can be used if width is given as a character string. All of these systems are arbitrary: they have different numbers of days in each bin, and leap years affect the number of days in February. The most common, of course, is by month ("mon"). Meteorological quarterly seasons ("DJF") are based on grouping three months, starting with December. This style of grouping is commonly used in climate literature, and is preferred over the season names winter, spring, summer, and autumn, which apply to only one hemisphere. The less common annual quarterly divisions ("JFM") are similar, except that grouping begins with January. Zodiac divisions ("zod") are included for demonstrative purposes, and are based on the Tropical birth dates (common in Western-culture horoscopes) starting with Aries (March 21).
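The integer-width rule described in this section can be sketched in base R as follows. This is an illustration only, not the package's implementation: the helper name bin_days and its min_frac argument are hypothetical, and the 20% rule is interpreted here as "the leftover days in a 366-day year must be at least 20% of width", which is an assumption about the wording above.

```r
# Hypothetical helper: assign Julian days (1..366) to fixed-width bins.
# Assumption: a short trailing bin is merged into the previous bin when
# its leap-year length is below min_frac * width.
bin_days <- function(jday, width, min_frac = 0.2) {
  stopifnot(width >= 1, all(jday >= 1 & jday <= 366))
  bin <- (jday - 1) %/% width + 1      # 1-based bin index
  n_full <- 366 %/% width              # complete bins in a leap year
  remainder <- 366 %% width            # leftover days after the complete bins
  merge_last <- remainder > 0 && remainder < min_frac * width
  if (merge_last) {
    bin <- pmin(bin, n_full)           # fold the short tail into the last full bin
  }
  n_bins <- if (remainder == 0 || merge_last) n_full else n_full + 1
  factor(bin, levels = seq_len(n_bins))
}

# 11-day bins: 33 full bins plus a 3-day tail, which is kept as its own bin.
f11 <- bin_days(1:366, 11)
# 13-day bins: the 2-day tail is below 20% of 13, so it is merged.
f13 <- bin_days(1:366, 13)
print(table(f11))
print(table(f13))
```

Fixing the bin layout from the leap-year length keeps the factor levels identical across years, so only the count in the final bin changes between leap and non-leap years.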

Here is the complete list of options for the width argument:

  • numeric: the width of each bin (or group) in days
  • 366/n: divide the year into n sections
  • "mon": month intervals (abbreviated month names)
  • "month": month intervals (full month names)
  • "DJF": meteorological quarterly divisions: DJF, MAM, JJA, SON
  • "JFM": annual quarterly divisions: JFM, AMJ, JAS, OND
  • "JF": annual six divisions: JF, MA, AJ, JA, SO, ND
  • "zod": zodiac intervals (abbreviated symbol names)
  • "zodiac": zodiac intervals (full zodiac names)
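To make the "DJF" option in the list above concrete, here is a small base R sketch that assigns each day of 2005 to a meteorological season by month number. It does not call mkfact; the lookup vector season_of_month is a hypothetical name introduced for illustration, and the package may build its factor differently.

```r
# Every day of 2005 (a non-leap year).
dates <- seq(as.Date("2005-01-01"), as.Date("2005-12-31"), by = "day")

# Month number (1..12) for each date.
month_num <- as.integer(format(dates, "%m"))

# Lookup from month number to meteorological season; December joins Jan/Feb.
season_of_month <- c("DJF", "DJF", "MAM", "MAM", "MAM", "JJA",
                     "JJA", "JJA", "SON", "SON", "SON", "DJF")

# Ordered levels so that tables and plots follow the DJF, MAM, JJA, SON order.
season <- factor(season_of_month[month_num],
                 levels = c("DJF", "MAM", "JJA", "SON"))

days_per_season <- table(season)
print(days_per_season)
```

Note that within a single calendar year the DJF bin combines January and February with the following December, so it is not a contiguous run of days.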

References

http://en.wikipedia.org/wiki/Solar_calendar

See Also

plot.seas.temp, seas.sum

Examples

# Demonstrate the number of days in each category
barplot(table(mkfact(width="mon", year=2005)),
  main="Number of days in each month")

barplot(table(mkfact(width="zod", year=2005)),
  main="Number of days in each zodiac sign")

barplot(table(mkfact(width="DJF", year=2005)),
  main="Number of days in each meteorological season")

barplot(table(mkfact(width=5, year=2005)),
  main="Number of days in 5-day categories")

barplot(table(mkfact(width=11, year=2005)),
  main="Number of days in 11-day categories")

barplot(table(mkfact(width=366/12, year=2005)),
  main="Number of days in 12-section year",
  sub="Note: not exactly the same as months")

# Application using synthetic data
dat <- data.frame(date=as.Date(paste(2005,1:365),"%Y %j"),
  value=(-cos(1:365*2*pi/365)*10+rnorm(365)*3+10))


dat$d5 <- mkfact(dat,5)
dat$d11 <- mkfact(dat,11)
dat$month <- mkfact(dat,"mon")
dat$DJF <- mkfact(dat,"DJF")

plot(value ~ date, dat)
plot(value ~ d5, dat)
plot(value ~ d11, dat)
plot(value ~ month, dat)
plot(value ~ DJF, dat)

print(head(dat))

tapply(dat$value, dat$month, mean, na.rm=TRUE)
tapply(dat$value, dat$DJF, mean, na.rm=TRUE)
