2025-04-28
T2FNorm
Action Value Gradient
also does penultimate layer. And ablation shows it is a key ingredient for the algorithm's success.
You can send your feedback, queries
here