This function extracts various features from application timestamps and, if provided, client dates of birth. It supports both POSIXct and Date objects for timestamps. If the timestamps are given as Date objects, note that features requiring intra-day granularity (e.g., timestamp_hour) will not be created, and some cyclic features may be less precise.
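The difference between the two timestamp classes can be illustrated with base R alone. The snippet below is only a sketch and does not show the package's internal code; the variable names are made up for the illustration.

```r
# Illustrative only: base R, not the package's internal implementation.
ts_posix <- as.POSIXct("2024-03-15 14:30:00", tz = "UTC")
ts_date  <- as.Date("2024-03-15")

# A POSIXct value carries a time of day, so the hour can be recovered.
hour_posix <- as.integer(format(ts_posix, "%H", tz = "UTC"))

# A Date has day resolution only, so there is no hour to extract.
has_time_of_day <- inherits(ts_date, "POSIXct")
```

This is why hour-based features can only be derived when the input is POSIXct.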
extract_timestamp_features(
  timestamps,
  date_of_birth = NULL,
  error_on_invalid = FALSE
)
A data frame with the extracted timestamp features and birthday-related features (if date_of_birth is provided).
timestamps: A vector of timestamps, either as POSIXct or Date objects.
date_of_birth: An optional vector of client dates of birth. If provided, it must have the same length as timestamps, enabling computation of age and birthday-related features.
error_on_invalid: Logical flag specifying whether to throw an error (TRUE) or a warning (FALSE, default) when missing or invalid timestamp values are detected.
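The error-versus-warning behaviour controlled by error_on_invalid could be sketched roughly as follows. The helper report_invalid is hypothetical and is not part of the package; it only checks for NA values, whereas the real function may apply further validity checks.

```r
# Hypothetical helper, not part of the package: reports missing values
# as an error or a warning depending on the flag.
report_invalid <- function(timestamps, error_on_invalid = FALSE) {
  n_bad <- sum(is.na(timestamps))
  if (n_bad > 0) {
    msg <- sprintf("%d missing or invalid timestamp value(s) detected.", n_bad)
    if (error_on_invalid) {
      stop(msg, call. = FALSE)
    } else {
      warning(msg, call. = FALSE)
    }
  }
  invisible(n_bad)
}

ts <- as.Date(c("2024-01-31", NA))
# With the default flag the call warns and continues.
n_bad <- suppressWarnings(report_invalid(ts))
```

Setting the flag to TRUE is the stricter choice for pipelines where a missing application timestamp should stop processing.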
The function returns a data frame containing the following variables:
Numeric. Month extracted from the timestamp (1 to 12).
Numeric. Sine transformation of the month (using period = 12).
Numeric. Cosine transformation of the month (using period = 12).
Numeric. Day of the month extracted from the timestamp (1 to 31).
Numeric. Sine transformation of the day of the month (using period = 31).
Numeric. Cosine transformation of the day of the month (using period = 31).
Numeric. ISO week number extracted from the timestamp (typically 1 to 52, but may be 53 in some years).
Numeric. Sine transformation of the week of the year (using period = 52).
Numeric. Cosine transformation of the week of the year (using period = 52).
Numeric. Day of the week extracted from the timestamp (1 for Monday through 7 for Sunday).
Numeric. Sine transformation of the day of the week (using period = 7).
Numeric. Cosine transformation of the day of the week (using period = 7).
Numeric. Hour of the day (0 to 23). This is only available if timestamps are of class POSIXct.
Numeric. Sine transformation of the hour (using period = 24).
Numeric. Cosine transformation of the hour (using period = 24).
Numeric. Client's age at the time of application, calculated in years (real number).
Numeric. Number of days until the client's next birthday.
Numeric. Cosine transformation of the days to birthday (using period = 365).
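The sine and cosine transformations listed above can be sketched as below. This assumes the common encoding sin(2 * pi * x / period) and cos(2 * pi * x / period); the documentation states only the periods, so the exact formula used by the package is an assumption, and cyclic_encode is a hypothetical helper.

```r
# Hypothetical helper; the package's internal formula may differ.
cyclic_encode <- function(x, period) {
  angle <- 2 * pi * x / period
  list(sine = sin(angle), cosine = cos(angle))
}

# Months 1, 6 and 12 encoded with period = 12.
month_enc <- cyclic_encode(c(1, 6, 12), period = 12)
```

Under this encoding December (12) and January (1) land next to each other on the unit circle, which the raw values 12 and 1 do not convey.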
The function first validates the inputs and then extracts the following features:
Extracts date-based features such as year, month, day of month, ISO week of the year, and day of the week.
If timestamps are of class POSIXct, it also extracts the hour of the day.
Applies cyclic transformations (using sine and cosine functions) to the month, day of month, week of year, day of week, and hour variables so that their cyclical nature is maintained in machine learning models.
If date_of_birth is provided, computes the client's age at the time of application and the number of days until their next birthday, along with a cosine transformation for the latter.
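The birthday-related step can be approximated in base R as shown below. Both helpers are hypothetical and rest on assumptions not stated in the documentation: inputs are Date objects, age is approximated as elapsed days divided by 365.25, an application made on the birthday itself counts as 0 days, and 29 February birthdays are not handled (they yield NA in non-leap years).

```r
# Hypothetical helpers; the package may compute these values differently.
approx_age_years <- function(application_date, date_of_birth) {
  # Approximation: elapsed days divided by the average year length.
  as.numeric(application_date - date_of_birth) / 365.25
}

days_to_next_birthday <- function(application_date, date_of_birth) {
  app_year <- as.integer(format(application_date, "%Y"))
  month_day <- format(date_of_birth, "-%m-%d")
  # Birthday in the application year and in the following year.
  this_year <- as.Date(paste0(app_year, month_day), format = "%Y-%m-%d")
  next_year <- as.Date(paste0(app_year + 1L, month_day), format = "%Y-%m-%d")
  # Where this year's birthday has already passed, use next year's.
  passed <- which(this_year < application_date)
  this_year[passed] <- next_year[passed]
  as.numeric(this_year - application_date)
}

app <- as.Date(c("2024-03-15", "2024-03-15"))
dob <- as.Date(c("1990-03-20", "1990-03-10"))
days_left <- days_to_next_birthday(app, dob)
age <- approx_age_years(app, dob)
```

The first client's birthday is still ahead in the application year, while the second client's has passed, so the count rolls over to the following year.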
# Load sample data
data(featForge_sample_data)

# Generate features and combine with the original dataset
result <- cbind(
  data.frame(
    application_created_at = featForge_sample_data$application_created_at,
    client_date_of_birth = featForge_sample_data$date_of_birth
  ),
  extract_timestamp_features(
    featForge_sample_data$application_created_at,
    featForge_sample_data$date_of_birth
  )
)

head(result)