Replay Buffer
Without Replay Buffer, SAC, PPO, … methods fail. For the streaming setting, i.e. setting where we can't have replay buffer, we need different algorithm. Action Value Gradient (AVG) works without replay buffer and batches.
Without Replay Buffer, SAC, PPO, … methods fail. For the streaming setting, i.e. setting where we can't have replay buffer, we need different algorithm. Action Value Gradient (AVG) works without replay buffer and batches.