
featForge (version 0.1.2)

extract_timestamp_features: Extract Timestamp Features

Description

This function extracts a set of features from application timestamps and, if provided, client dates of birth. Timestamps may be POSIXct or Date objects. With Date objects, features that require intra-day granularity (e.g., timestamp_hour) are not created, and some cyclic features may be less precise.

Usage

extract_timestamp_features(
  timestamps,
  date_of_birth = NULL,
  error_on_invalid = FALSE
)

Value

A data frame with the extracted timestamp features and birthday-related features (if date_of_birth is provided).

Arguments

timestamps

A vector of timestamps, either as POSIXct or Date objects.

date_of_birth

An optional vector of client dates of birth. If provided, it must have the same length as timestamps, enabling computation of age and birthday-related features.

error_on_invalid

Logical flag specifying whether to throw an error (TRUE) or a warning (FALSE, default) when missing or invalid timestamp values are detected.
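
The warning-versus-error behaviour controlled by error_on_invalid can be illustrated with a minimal base-R sketch. The helper check_timestamps() below is hypothetical and is not part of featForge; the class check and the message wording are assumptions, since this page does not show how the package validates its inputs.

```r
# Hypothetical helper (not part of featForge): warn or stop on missing timestamps.
check_timestamps <- function(timestamps, error_on_invalid = FALSE) {
  # Assumption: only POSIXct and Date inputs are accepted.
  if (!inherits(timestamps, c("POSIXct", "Date"))) {
    stop("`timestamps` must be a POSIXct or Date vector.")
  }
  n_bad <- sum(is.na(timestamps))
  if (n_bad > 0) {
    msg <- sprintf("%d missing or invalid timestamp value(s) detected.", n_bad)
    # TRUE escalates to an error; FALSE (the default) only warns.
    if (error_on_invalid) stop(msg) else warning(msg)
  }
  invisible(n_bad)
}

x <- as.Date(c("2024-01-01", NA))
# Capture the condition instead of letting it propagate.
default_result <- tryCatch(check_timestamps(x), warning = function(w) "warned")
strict_result <- tryCatch(check_timestamps(x, TRUE), error = function(e) "errored")
```

With the default setting, processing can continue after the warning; setting the flag to TRUE is the stricter choice for pipelines that should stop on bad input.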

Details

The function returns a data frame containing the following variables:

timestamp_month

Numeric. Month extracted from the timestamp (1 to 12).

timestamp_month_sine

Numeric. Sine transformation of the month (using period = 12).

timestamp_month_cosine

Numeric. Cosine transformation of the month (using period = 12).

timestamp_day_of_month

Numeric. Day of the month extracted from the timestamp (1 to 31).

timestamp_day_of_month_sine

Numeric. Sine transformation of the day of the month (using period = 31).

timestamp_day_of_month_cosine

Numeric. Cosine transformation of the day of the month (using period = 31).

timestamp_week_of_year

Numeric. ISO week number extracted from the timestamp (typically 1 to 52, but may be 53 in some years).

timestamp_week_of_year_sine

Numeric. Sine transformation of the week of the year (using period = 52).

timestamp_week_of_year_cosine

Numeric. Cosine transformation of the week of the year (using period = 52).

timestamp_day_of_week

Numeric. Day of the week extracted from the timestamp (1 for Monday through 7 for Sunday).

timestamp_day_of_week_sine

Numeric. Sine transformation of the day of the week (using period = 7).

timestamp_day_of_week_cosine

Numeric. Cosine transformation of the day of the week (using period = 7).

timestamp_hour

Numeric. Hour of the day (0 to 23). This is only available if timestamps are of class POSIXct.

timestamp_hour_sine

Numeric. Sine transformation of the hour (using period = 24).

timestamp_hour_cosine

Numeric. Cosine transformation of the hour (using period = 24).

client_age_at_application

Numeric. Client's age at the time of application, in years, as a real (fractional) number.

days_to_birthday

Numeric. Number of days until the client's next birthday.

days_to_birthday_cosine

Numeric. Cosine transformation of the days to birthday (using period = 365).
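
The sine and cosine variables listed above can be approximated with a short base-R sketch. It assumes the common encoding sin(2 * pi * x / period) and cos(2 * pi * x / period); this page states only the period used for each variable, so the package's exact formula (for example, whether values are shifted to start at zero) may differ. The helper cyclic_encode() is hypothetical and not part of featForge.

```r
# Hypothetical helper (not part of featForge): map a periodic value onto the unit circle.
cyclic_encode <- function(x, period) {
  angle <- 2 * pi * x / period
  data.frame(sine = sin(angle), cosine = cos(angle))
}

# Base-R extraction of the calendar parts described above.
ts <- as.POSIXct(c("2024-01-15 08:30:00", "2024-12-31 23:00:00"), tz = "UTC")
month <- as.numeric(format(ts, "%m"))        # 1 to 12
day_of_month <- as.numeric(format(ts, "%d")) # 1 to 31
iso_week <- as.numeric(format(ts, "%V"))     # ISO 8601 week number
day_of_week <- as.numeric(format(ts, "%u"))  # 1 = Monday through 7 = Sunday
hour <- as.numeric(format(ts, "%H"))         # 0 to 23

# Encode each part with the period given in the variable list.
month_enc <- cyclic_encode(month, period = 12)
hour_enc <- cyclic_encode(hour, period = 24)
```

Under this encoding, December (12) and January (1) end up close together on the circle, which a plain 1 to 12 integer does not convey to a model.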

The function first validates the inputs and then performs the following steps:

  • Extracts date-based features such as year, month, day of month, ISO week of the year, and day of the week.

  • If timestamps are of class POSIXct, it also extracts the hour of the day.

  • Applies cyclic transformations (using sine and cosine functions) to the month, day of month, week of year, day of week, and hour variables so that their cyclical nature is maintained in machine learning models.

  • If date_of_birth is provided, computes the client's age at the time of application and the number of days until their next birthday, along with a cosine transformation for the latter.
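
The birthday-related step can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's implementation: days_to_next_birthday() is a hypothetical helper, a birthday that falls on the application date counts as zero days away, 29 February birthdays are treated as 28 February, age is approximated as elapsed days divided by 365.25, and inputs are assumed to contain no missing values.

```r
# Hypothetical helper (not part of featForge): days until the next birthday.
days_to_next_birthday <- function(application_date, date_of_birth) {
  app <- as.Date(application_date)
  dob <- as.Date(date_of_birth)
  app_year <- as.integer(format(app, "%Y"))
  month_day <- format(dob, "%m-%d")
  # Assumption: leap-day birthdays are observed on 28 February.
  month_day[month_day == "02-29"] <- "02-28"
  this_year <- as.Date(paste0(app_year, "-", month_day), format = "%Y-%m-%d")
  next_year <- as.Date(paste0(app_year + 1L, "-", month_day), format = "%Y-%m-%d")
  # Use next year's birthday when this year's has already passed.
  next_birthday <- this_year
  passed <- which(this_year < app)
  next_birthday[passed] <- next_year[passed]
  as.numeric(next_birthday - app)
}

app <- as.Date(c("2024-03-01", "2024-12-20"))
dob <- as.Date(c("1990-06-15", "1985-01-05"))

days <- days_to_next_birthday(app, dob)
age_years <- as.numeric(app - dob) / 365.25  # approximate age in fractional years
days_cosine <- cos(2 * pi * days / 365)      # period of 365, as in the variable list
```

The cosine is close to 1 when the birthday is near and close to -1 about half a year away, so it acts as a smooth "distance to birthday" signal.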

Examples

# Load sample data
data(featForge_sample_data)

# Generate features and combine with the original dataset
result <- cbind(
  data.frame(
    application_created_at = featForge_sample_data$application_created_at,
    client_date_of_birth = featForge_sample_data$date_of_birth
  ),
  extract_timestamp_features(
    featForge_sample_data$application_created_at,
    featForge_sample_data$date_of_birth
  )
)

head(result)
