spinifex (version 0.1.0)

weather: Sample dataset of daily weather observations from Canberra airport in Australia.

Description

A subset of rattle.data::weather; instructions to reproduce it are given below.

Usage

weather

Format

A data frame (tibble) of 366 observations of 20 variables: one year of daily weather observations at Canberra airport in Australia, starting November 2007.

Details

One year of daily weather observations collected from the Canberra airport in Australia was obtained from the Australian Commonwealth Bureau of Meteorology and processed to create this sample dataset for illustrating data mining using R and Rattle.

The data has been processed to provide a target variable RainTomorrow (whether there is rain on the following day, No/Yes) and a risk variable RISK_MM (how much rain was recorded, in millimeters). Various transformations were performed on the source data. The dataset is quite small and is useful only for repeatable demonstration of various data science operations.
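
As a rough illustration of how a target of this kind could be derived, the sketch below shifts a rainfall series by one day. The helper name add_rain_target, the toy values, and the 1 mm threshold (borrowed from the RainToday definition) are assumptions made for illustration; this is not the processing actually applied to the source data.

```r
# Hypothetical helper: derive next-day rain variables from a daily rainfall series.
# The 1 mm default threshold is an assumption, not taken from the source data.
add_rain_target <- function(rainfall, threshold = 1) {
  # Rainfall recorded on the following day; unknown (NA) for the last day.
  risk_mm <- c(rainfall[-1], NA_real_)
  data.frame(
    Rainfall     = rainfall,
    RISK_MM      = risk_mm,
    RainTomorrow = factor(ifelse(risk_mm > threshold, "Yes", "No"),
                          levels = c("No", "Yes"))
  )
}

# Toy rainfall values in mm (not taken from the dataset).
toy <- add_rain_target(c(0, 3.6, 0.2, 12.4, 0))
print(toy)
```

The last row has no following day, so both derived columns are NA there.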

The source dataset is Copyright by the Australian Commonwealth Bureau of Meteorology and is provided as part of the rattle package with permission.

The 20 variables are:

  • Date, The date of observation (a Date object).

  • MinTemp, The minimum temperature in degrees Celsius.

  • MaxTemp, The maximum temperature in degrees Celsius.

  • Rainfall, The amount of rainfall recorded for the day in mm.

  • Evaporation, The so-called Class A pan evaporation (mm) in the 24 hours to 9am.

  • Sunshine, The number of hours of bright sunshine in the day.

  • WindGustSpeed, The speed (km/h) of the strongest wind gust in the 24 hours to midnight.

  • WindSpeed9am, Wind speed (km/h) averaged over 10 minutes prior to 9am.

  • WindSpeed3pm, Wind speed (km/h) averaged over 10 minutes prior to 3pm.

  • Humid9am, Relative humidity (percent) at 9am.

  • Humid3pm, Relative humidity (percent) at 3pm.

  • Pressure9am, Atmospheric pressure (hPa) reduced to mean sea level at 9am.

  • Pressure3pm, Atmospheric pressure (hPa) reduced to mean sea level at 3pm.

  • Cloud9am, Fraction of sky obscured by cloud at 9am. This is measured in "oktas", a unit of eighths: it records how many eighths of the sky are obscured by cloud. A measure of 0 indicates a completely clear sky, whilst 8 indicates that it is completely overcast.

  • Cloud3pm, Fraction of sky obscured by cloud (in "oktas": eighths) at 3pm. See Cloud9am for a description of the values.

  • Temp9am, Temperature (degrees C) at 9am.

  • Temp3pm, Temperature (degrees C) at 3pm.

  • RainToday, Integer: 1 if precipitation (mm) in the 24 hours to 9am exceeds 1mm, otherwise 0.

  • RISK_MM, The amount of rain recorded (mm). A kind of measure of the "risk".

  • RainTomorrow, The target variable: whether it rained on the following day (No/Yes).
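
The okta scale used by Cloud9am and Cloud3pm can be turned into a plain fraction of sky covered by dividing by 8. The following is a minimal sketch; the helper name oktas_to_fraction and the sample values are hypothetical and not part of the package.

```r
# Hypothetical helper: convert cloud cover in oktas (0 to 8) to a fraction in [0, 1].
oktas_to_fraction <- function(oktas) {
  # Reject values outside the okta scale; missing values are passed through.
  stopifnot(all(oktas >= 0 & oktas <= 8, na.rm = TRUE))
  oktas / 8
}

cloud9am <- c(0, 4, 8, NA, 7)  # toy values, not taken from the dataset
print(oktas_to_fraction(cloud9am))
```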

Reproducing this dataset:

library("rattle.data")
# Keep 20 of the source columns; as.tibble() is deprecated in favour of as_tibble().
weather <- tibble::as_tibble(weather[, c(1, 3:7, 9, 12:24)])
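
The column index used above can be checked without loading any data. This sketch assumes the source data frame has 24 columns, as the upper end of the index range suggests; the variable name keep is introduced only for illustration.

```r
# Column positions retained from the source data frame (assumed to have 24 columns).
keep <- c(1, 3:7, 9, 12:24)

length(keep)          # number of retained columns
setdiff(1:24, keep)   # positions that are dropped
```

The count of retained positions matches the 20 variables described in the Format section.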

References

Data source: http://www.bom.gov.au/climate/dwo/ and http://www.bom.gov.au/climate/data.

Examples

# NOT RUN {
str(weather)
# }
# NOT RUN {
play_manual_tour(data = weather[, 2:17], manip_var = 5, init_rescale_data = TRUE)
# }
