This function uses an environment function to generate sample experience in the form of state transition tuples (s, a, r, s_new).
sampleGridSequence(N, actionSelection = "random", control = list(alpha = 0.1,
  gamma = 0.1, epsilon = 0.1), model = NULL, ...)
N: Number of samples.
actionSelection: (optional) Defines the action selection mode of the reinforcement learning agent. Default: "random".
control: (optional) Control parameters defining the behavior of the agent. Default: alpha = 0.1; gamma = 0.1; epsilon = 0.1.
model: (optional) Existing model of class rl. Default: NULL.
...: Additional parameters passed to function.
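The arguments above can be combined as in the following sketch. It assumes that sampleGridSequence() is already available in the session (for example, because the package that provides it has been attached). The exists() guard is not part of the documented interface; it only keeps the snippet from failing when the function is missing.

```r
# Control parameters, set to the documented defaults.
control <- list(alpha = 0.1, gamma = 0.1, epsilon = 0.1)

# Assumption: sampleGridSequence() has been made available beforehand.
if (exists("sampleGridSequence", mode = "function")) {
  # Draw 1000 state transition tuples with random action selection.
  samples <- sampleGridSequence(N = 1000, actionSelection = "random",
                                control = control, model = NULL)
  print(head(samples))
} else {
  message("sampleGridSequence() not found; attach its package first.")
}
```

Leaving model at NULL matches the documented default; an existing model of class rl would be passed only when one is at hand.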
A data frame containing the experienced state transition tuples (s, a, r, s_new).
The individual columns are as follows:
State: The current state.
Action: The selected action for the current state.
Reward: The reward in the current state.
NextState: The next state.
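To make the shape of the returned data frame concrete, here is a minimal base R sketch that builds a table with the same four columns. Everything in it is hypothetical: toy_env() is an invented 2x2 grid with made-up rewards, and sample_toy_sequence() draws states and actions independently at random. It illustrates the column layout only and is not how sampleGridSequence() is implemented.

```r
# Hypothetical 2x2 grid (s1 s2 / s3 s4). Moving into s4 pays 10,
# every other move costs 1. Invented for illustration only.
toy_states  <- c("s1", "s2", "s3", "s4")
toy_actions <- c("up", "down", "left", "right")

toy_env <- function(state, action) {
  pos <- match(state, toy_states) - 1        # 0-based cell index
  row <- pos %/% 2
  col <- pos %% 2
  if (action == "up")    row <- max(row - 1, 0)
  if (action == "down")  row <- min(row + 1, 1)
  if (action == "left")  col <- max(col - 1, 0)
  if (action == "right") col <- min(col + 1, 1)
  next_state <- toy_states[row * 2 + col + 1]
  reward <- if (next_state == "s4") 10 else -1
  list(NextState = next_state, Reward = reward)
}

# Assumption: each sample starts from an independently drawn state.
sample_toy_sequence <- function(N, env = toy_env) {
  s <- sample(toy_states, N, replace = TRUE)
  a <- sample(toy_actions, N, replace = TRUE)
  res <- Map(env, s, a)                      # one environment call per sample
  data.frame(
    State     = s,
    Action    = a,
    Reward    = vapply(res, function(x) x$Reward, numeric(1), USE.NAMES = FALSE),
    NextState = vapply(res, function(x) x$NextState, character(1), USE.NAMES = FALSE),
    stringsAsFactors = FALSE,
    row.names = NULL
  )
}

set.seed(1)
toy_samples <- sample_toy_sequence(5)
str(toy_samples)
```

Each row is one tuple (s, a, r, s_new), so the result can be inspected or filtered like any other data frame.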