RDruid (version 0.2.3)

druid.query.timeseries: Query time series data

Description

Queries Druid for time series data and returns the result as a data frame

Usage

druid.query.timeseries(url = druid.url(), dataSource, intervals,
  aggregations, filter = NULL, granularity = "all",
  postAggregations = NULL, context = NULL, rawData = FALSE,
  verbose = F, ...)

Arguments

url
URL to connect to Druid, defaults to druid.url()
dataSource
name of the data source to query
intervals
time period to retrieve data for, as an interval object or a list of interval objects
aggregations
list of metric aggregations to compute for this data source
filter
filter specifying the subset of the data to extract
granularity
time granularity at which to aggregate
postAggregations
post-aggregations to perform on the aggregations
context
query context
rawData
if set, returns the result object as is, without converting it to a data frame
verbose
if TRUE, prints the JSON query sent to Druid
...
additional parameters to pass to druid.resulttodf
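
As a hedged illustration only (interval() and ymd() are helpers from the lubridate package, the list-of-intervals form follows the description of the intervals argument above, and the metric and dimension constructors are the ones used in the Examples below), the main arguments might be assembled like this before calling druid.query.timeseries:

library(lubridate)
# a list of interval objects, as accepted by the intervals argument
qry_intervals <- list(interval(ymd("2012-07-01"), ymd("2012-07-08")),
                      interval(ymd("2012-07-08"), ymd("2012-07-15")))
# a list of metric aggregations, as accepted by the aggregations argument
qry_aggs      <- list(sum(metric("count")), sum(metric("length")))
# a filter restricting the query to a single hashtag
qry_filter    <- dimension("hashtag") == "druid"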

Value

Returns a data frame where each column represents a time series

See Also

druid.query.groupBy, druid.query.topN, granularity

Examples

## Not run: 
# 
#    # Get the time series associated with the twitter hashtag #druid, by hour
#    druid.query.timeseries(url = druid.url(host = "<hostname>"),
#                          dataSource   = "twitter",
#                          intervals    = interval(ymd("2012-07-01"), ymd("2012-07-15")),
#                          aggregations = sum(metric("count")),
#                          filter       = dimension("hashtag") == "druid",
#                          granularity  = granularity("hour"))
# 
#    # Average tweet length for a combination of hashtags in a given time zone
#    druid.query.timeseries(url = druid.url("<hostname>"),
#                          dataSource   = "twitter",
#                          intervals    = interval(ymd("2012-07-01"), ymd("2012-08-30")),
#                          aggregations = list(
#                                            sum(metric("count")),
#                                            sum(metric("length"))
#                                         ),
#                          postAggregations = list(
#                                            avg_length = field("length") / field("count")
#                                         ),
#                          filter       =   dimension("hashtag") == "london2012"
#                                         | dimension("hashtag") == "olympics",
#                          granularity  = granularity("PT6H", timeZone="Europe/London"))
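# 
#    # A hedged sketch only: the column names used below ("timestamp" and
#    # "count") are assumptions about the returned data frame, not documented
#    # names. Store the result of the first query and plot the hourly counts.
#    res <- druid.query.timeseries(url = druid.url(host = "<hostname>"),
#                                  dataSource   = "twitter",
#                                  intervals    = interval(ymd("2012-07-01"), ymd("2012-07-15")),
#                                  aggregations = sum(metric("count")),
#                                  filter       = dimension("hashtag") == "druid",
#                                  granularity  = granularity("hour"))
#    plot(res$timestamp, res$count, type = "l")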
#   ## End(Not run)