Generates sample experience in the form of state transition tuples by querying an environment function.
sampleGridSequence(N, actionSelection = "random",
  control = list(alpha = 0.1, gamma = 0.1, epsilon = 0.1),
  model = NULL, ...)
N
Number of samples.
actionSelection
(optional) Defines the action selection mode of the reinforcement learning agent. Default: "random".
control
(optional) Control parameters defining the behavior of the agent. Default: alpha = 0.1; gamma = 0.1; epsilon = 0.1.
model
(optional) Existing model of class rl. Default: NULL.
...
Additional parameters passed to the function.
A data frame containing the experienced state transition tuples (s, a, r, s_new).
The individual columns are as follows:
State
The current state.
Action
The selected action for the current state.
Reward
The reward in the current state.
NextState
The next state.
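
As a rough illustration of how tuples of this shape can be produced, the sketch below draws random state-action pairs, passes each to a small environment function, and collects the results in a data frame with the four columns listed above. The environment `toyEnvironment`, the helper `sampleToySequence`, and the state and action names are hypothetical and only stand in for the documented function; the sketch assumes purely random action selection and ignores the `control` and `model` arguments.

```r
# Hypothetical 2x2 gridworld, for illustration only.
# s4 is treated as the goal state; moves into walls leave the state unchanged.
toyEnvironment <- function(state, action) {
  moves <- list(
    s1 = c(down = "s2"),
    s2 = c(up = "s1", right = "s3"),
    s3 = c(left = "s2", up = "s4")
  )
  nextState <- state
  if (!is.null(moves[[state]]) && action %in% names(moves[[state]])) {
    nextState <- unname(moves[[state]][action])
  }
  # Reward 10 for stepping from s3 into the goal, otherwise a step cost of -1.
  reward <- if (state == "s3" && nextState == "s4") 10 else -1
  list(NextState = nextState, Reward = reward)
}

# Hypothetical sampler: N random (state, action) pairs and their outcomes.
sampleToySequence <- function(N, states, actions) {
  s <- sample(states, N, replace = TRUE)
  a <- sample(actions, N, replace = TRUE)
  steps <- mapply(toyEnvironment, s, a, SIMPLIFY = FALSE, USE.NAMES = FALSE)
  data.frame(
    State = s,
    Action = a,
    Reward = vapply(steps, function(x) x$Reward, numeric(1)),
    NextState = vapply(steps, function(x) x$NextState, character(1)),
    stringsAsFactors = FALSE
  )
}

set.seed(1)
samples <- sampleToySequence(
  N = 5,
  states = c("s1", "s2", "s3", "s4"),
  actions = c("up", "down", "left", "right")
)
print(samples)
```

Each row of `samples` is one transition tuple: the state the agent was in, the action it took, the reward it received, and the state it ended up in. With the documented function, the analogous call would presumably be `sampleGridSequence(N = 5)`, relying on the defaults shown in the usage above.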