base_pref: Base Preferences

Description

Base preferences are used to describe the different goals (e.g., dimensions in case of a Skyline query) of a preference query.

Usage

low(expr, df = NULL)
low_(expr, df = NULL)
high(expr, df = NULL)
high_(expr, df = NULL)
true(expr, df = NULL)
true_(expr, df = NULL)

Arguments

expr

A numerical/logical expression which is the term to evaluate for the current preference. The objective is to search for minimal/maximal values of this expression (for low/high) or for logical TRUE values (for true).

df

(optional) A data frame with the same structure (i.e., columns) as the data frame on which this preference is evaluated later on. Causes a partial evaluation of the preference. Only the column names of df are relevant.

Partial Evaluation of Preferences

If the optional parameter df is given, then the expression is evaluated at the time of definition as far as possible. All variables occurring as columns in df remain untouched. For example, consider

f <- function(x) 2*x
p <- true(cyl == f(1), mtcars)

This results in the preference true(cyl == 2), as the variable cyl is a column in mtcars, whereas f(1) is evaluated immediately. The rows of df are not relevant, e.g., using mtcars[0,] instead of mtcars makes no difference.

The preference selection, i.e., psel(mtcars, p), can also be done without the partial evaluation. But this results in an error if the function f has meanwhile been removed from the current environment. Hence it is safer to do an early partial evaluation of all preferences that use user-defined functions.

The partial evaluation can be done manually by eval.pref.
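The difference between late and early evaluation can be sketched as follows. This is a minimal illustration, assuming rPref is installed; the names p_late, p_early, res_early and late_outcome are chosen here for illustration only, and the exact error message for the late case may differ between package versions.

```r
library(rPref)

f <- function(x) 2 * x

# Without df: f is only looked up when the preference is evaluated
p_late <- true(cyl == f(2))

# With df: f(2) is evaluated now; only the column names of df matter,
# so a zero-row data frame is sufficient
p_early <- true(cyl == f(2), mtcars[0, ])

# Simulate that f is no longer available
rm(f)

# The partially evaluated preference no longer depends on f
res_early <- psel(mtcars, p_early)

# The late preference is expected to fail now; the outcome is captured
# instead of stopping the script
late_outcome <- tryCatch(
  {
    psel(mtcars, p_late)
    "selection succeeded"
  },
  error = function(e) conditionMessage(e)
)
print(late_outcome)
```

Since f(2) evaluates to 4, res_early should contain the cars with four cylinders.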

Using Expressions in Preferences

The low_, high_ and true_ preferences have the same functionality as low, high and true but expect an expression e or symbol e as argument. For example, low(a) is equivalent to low_(expression(a)) or low_(as.symbol("a")).
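One use of the underscore variants is building a preference from a column name that is only known at run time. The following is a small sketch under that assumption; col, p_sym and p_direct are illustrative names, not part of the package.

```r
library(rPref)

# Column name held in a character variable, e.g. taken from user input
col <- "mpg"

# Build the preference from a symbol instead of a literal expression
p_sym <- low_(as.symbol(col))

# The same preference written directly
p_direct <- low(mpg)

res_sym <- psel(mtcars, p_sym)
res_direct <- psel(mtcars, p_direct)

# Both selections should return the same tuples
print(identical(res_sym, res_direct))
```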

This is very helpful for developing your own base preferences. Assume you want to define a base preference false as the dual of true. A definition like false <- function(x) -true(x) is the wrong approach, as psel(data.frame(a = c(1,2)), false(a == 1)) will result in the error "object 'a' not found". This is because a is treated as a variable to be evaluated immediately and not as an (abstract) symbol to be evaluated later. By defining

false <- function(x, ...) -true_(substitute(x), ...)

one gets a preference which behaves like a "built-in" preference. Additional optional parameters (like df) are passed through. The object false(a == 1) will output [Preference] -true(a == 1) on the console, and psel(data.frame(a = c(1,2)), false(a == 1)) correctly returns the second tuple with a == 2.

There is a special symbol df__ which can be used in preference expression to access the given data set df, when psel is called on this data set. For example, on a data set where the first column has the name A the preference low(df__[[1]]) is equivalent to low(A).
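A short sketch of the df__ symbol, assuming rPref is installed and using mtcars, whose first column is mpg. The variable names are illustrative only.

```r
library(rPref)

# Address the first column by position via the special symbol df__
p_pos <- low(df__[[1]])

# Address the same column by name
p_name <- low(mpg)

res_pos <- psel(mtcars, p_pos)
res_name <- psel(mtcars, p_name)

# Both preferences should select the same tuples
print(identical(res_pos, res_name))
```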

Details

Mathematically, all base preferences are strict weak orders (irreflexive, transitive and negative transitive).
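The three properties can be checked by brute force for a small example. The sketch below uses only base R and models a low preference as the "smaller than" relation on a hypothetical score vector; it illustrates the definitions and is not how the package represents preferences internally.

```r
# Hypothetical scores of four tuples (ties are allowed)
scores <- c(3, 1, 2, 1)
n <- length(scores)

# better[i, j] is TRUE if tuple i is strictly preferred to tuple j
better <- outer(scores, scores, "<")

# All index triples (i, j, k)
idx <- expand.grid(i = seq_len(n), j = seq_len(n), k = seq_len(n))
b_ij <- better[cbind(idx$i, idx$j)]
b_jk <- better[cbind(idx$j, idx$k)]
b_ik <- better[cbind(idx$i, idx$k)]

# Irreflexive: no tuple is better than itself
irreflexive <- !any(diag(better))

# Transitive: i better than j and j better than k implies i better than k
transitive <- all(!(b_ij & b_jk) | b_ik)

# Negatively transitive: i not better than j and j not better than k
# implies i not better than k
neg_transitive <- all(!(!b_ij & !b_jk) | !b_ik)

print(c(irreflexive, transitive, neg_transitive))
```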

The three fundamental base preferences are:

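low(expr), high(expr): Search for minimal/maximal values of expr.

true(expr): Searches for logical TRUE values of expr, i.e., TRUE is preferred to FALSE.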

The term expr may be just a single attribute or an arbitrary expression depending on more than one attribute, e.g., low(a+2*b+f(c)). Here, a, b and c are columns of the addressed data set and f has to be a previously defined function.

Functions contained in expr are evaluated over the entire data set, i.e., it is possible to use aggregate functions (min, mean, etc.). Note that all functions (and also all variables which are not columns of the data set on which expr will be evaluated) must be defined in the same environment (e.g., the environment of a function or the global environment) in which the base preference is defined.
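As a hedged illustration of aggregate functions inside expr, the following sketch selects the cars whose horsepower is closest to the mean horsepower and cross-checks the result with plain base R. It assumes rPref is installed; deviation and expected are names introduced here for the comparison.

```r
library(rPref)

# mean(hp) is computed over all rows of the data set passed to psel
p <- low(abs(hp - mean(hp)))
res <- psel(mtcars, p)

# The same selection written in base R for comparison
deviation <- abs(mtcars$hp - mean(mtcars$hp))
expected <- mtcars[deviation == min(deviation), ]

print(setequal(rownames(res), rownames(expected)))
```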

Examples

library(rPref)

# define a preference with a score value combining mpg and hp
p1 <- high(4 * mpg + hp)
# Perform the preference selection
psel(mtcars, p1)

# define a preference with a given function
f <- function(x, y) (abs(x - mean(x))/max(x) + abs(y - mean(y))/max(y))
p2 <- low(f(mpg, hp))
psel(mtcars, p2)