pomdp (version 1.0.0)

reward: Calculate the Reward for a POMDP Solution

Description

This function calculates the expected total reward for a POMDP solution given a starting belief state.

Usage

reward(x, belief = NULL, epoch = 1)

Arguments

x

a solved POMDP object.

belief

specification of the current belief state (see the argument start in POMDP for details). By default, the belief state defined in the model as start is used.

epoch

the epoch for which the reward is returned. Defaults to the first epoch.

Value

A list with the components

reward

the total expected reward given a belief and epoch.

belief_state

the belief state specified in belief.

pg_node

the policy graph node that represents the belief state.

action

the optimal action.
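
For illustration, these components can be accessed like elements of any R list. This is a minimal sketch; sol is a solved Tiger model as constructed in the Examples below:

r <- reward(sol)
r$reward   # total expected reward at the given belief
r$pg_node  # policy graph node representing that belief
r$action   # optimal action at that belief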

Details

The value is calculated using the value function stored in the POMDP solution. The value function is represented by a set of alpha vectors; the reward for a belief state is the maximum over these alpha vectors of the inner product between the belief vector and the alpha vector, and the maximizing vector determines the policy graph node and the optimal action.
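
The following sketch illustrates this computation outside the package. The matrix alpha (one row per policy graph node, one column per state) is a hypothetical stand-in for the alpha vectors stored in the solution, not the package's internal representation:

# value of a belief under a set of alpha vectors:
# the best alpha vector gives the reward and the policy graph node
value_at_belief <- function(alpha, belief) {
  vals <- alpha %*% belief   # value of each alpha vector at this belief
  node <- which.max(vals)    # index of the maximizing alpha vector
  list(reward = vals[node], pg_node = node)
}

# two states, two alpha vectors
alpha <- rbind(c(10, -5), c(-5, 10))
value_at_belief(alpha, c(0.85, 0.15))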

See Also

Other policy: optimal_action(), plot_policy_graph(), plot_value_function(), policy_graph(), policy(), solve_POMDP(), solve_SARSOP()

Examples

data("Tiger")
sol <- solve_POMDP(model = Tiger)

# if no belief is specified, the start belief from the model is used
# (uniform for the Tiger problem).
reward(sol)

# we have additional information that makes us believe that the tiger
# is more likely to the left.
reward(sol, belief = c(0.85, 0.15))

# specifying a state name places all belief mass on that state:
# we start knowing that the tiger is to the left.
reward(sol, belief = "tiger-left")

# Note that in this case, the total discounted expected reward is greater
# than 10 since the tiger problem resets and another game starting with
# a uniform belief is played, which produces additional reward.