Performs experience replay. Experience replay allows reinforcement learning agents to remember and reuse experiences from the past. The algorithm requires input data in the form of sample sequences consisting of states, actions and rewards. The result of the learning process is a state-action table Q that allows one to infer the best possible action in each state.
replayExperience(D, Q, control, ...)
D: A data frame containing the input data for reinforcement learning. Each row represents a state transition tuple (s, a, r, s_new).
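As an illustration of the expected input shape, the following minimal sketch builds a small data frame of transitions. The column names s, a, r and s_new mirror the tuple notation above and are an assumption, as are the state and action labels; the actual column names expected by the function may differ.

```r
# Hypothetical sample data: each row is one transition (s, a, r, s_new).
# Column names and values are assumptions made for this sketch.
D <- data.frame(
  s     = c("s1", "s2", "s1"),        # state before the action
  a     = c("right", "right", "left"), # action taken
  r     = c(0, 1, -1),                 # reward received
  s_new = c("s2", "s3", "s1"),         # state after the action
  stringsAsFactors = FALSE
)
print(D)
```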
Q: Existing state-action table of type hash.
control: Control parameters defining the behavior of the agent.
...: Additional parameters passed to the function.
Returns an object of class hash that contains the learned Q-table.
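To make the idea of replaying stored transitions concrete, here is a minimal base R sketch of one replay pass using the tabular Q-learning update. It is not the package's implementation: replay_sketch is a hypothetical helper, Q is stored as a named list of named numeric vectors rather than a hash object, and the control fields alpha (learning rate) and gamma (discount factor) are assumptions.

```r
# Sketch of one replay pass with tabular Q-learning.
# Assumptions: columns s, a, r, s_new; control has fields alpha and gamma;
# Q is a named list of named numeric vectors (not a hash object).
replay_sketch <- function(D, Q, control) {
  actions <- unique(as.character(D$a))
  for (i in seq_len(nrow(D))) {
    s     <- as.character(D$s[i])
    a     <- as.character(D$a[i])
    r     <- D$r[i]
    s_new <- as.character(D$s_new[i])

    # Unseen states start with a value of 0 for every known action.
    if (is.null(Q[[s]])) {
      Q[[s]] <- setNames(numeric(length(actions)), actions)
    }
    # States that have no entry yet contribute a future value of 0.
    max_next <- if (is.null(Q[[s_new]])) 0 else max(Q[[s_new]])

    # Q(s, a) <- Q(s, a) + alpha * (r + gamma * max Q(s_new, .) - Q(s, a))
    Q[[s]][a] <- Q[[s]][a] +
      control$alpha * (r + control$gamma * max_next - Q[[s]][a])
  }
  Q
}

# Usage with hypothetical data and parameters.
D <- data.frame(
  s     = c("s1", "s2", "s1"),
  a     = c("right", "right", "left"),
  r     = c(0, 1, -1),
  s_new = c("s2", "s3", "s1"),
  stringsAsFactors = FALSE
)
Q <- replay_sketch(D, Q = list(), control = list(alpha = 0.5, gamma = 0.9))

# Greedy policy: the highest-valued action in each state.
policy <- sapply(Q, function(q) names(which.max(q)))
print(policy)
```

Passing the returned Q back into replay_sketch together with the same D replays the stored experiences again, which is what lets earlier transitions benefit from values learned later.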
Lin (1992). "Self-Improving Reactive Agents Based on Reinforcement Learning, Planning and Teaching", Machine Learning (8:3), pp. 293--321.
Watkins (1992). "Q-learning". Machine Learning (8:3), pp. 279--292.