2026-01-29

SimBa - RL

[pdf][arXiv]

Ideas new to me that I found in the paper:

  1. Use of Fourier analysis as a complexity measure.
    • The complexity measure is a frequency-weighted average of the Fourier coefficients.

      Higher-frequency components in the function's output mean that the function is learning more complex patterns.

  2. Flatter minima are associated with functions of lower complexity, thereby improving generalization.
  3. Initially, this preference towards simplicity was attributed to SGD, but architectural components like normalization, ReLU, and residual connections also promote simplicity.
  4. Reinitialization technique: periodically reset the whole network and optimizer, but retain the collected data.
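
To make item 1 concrete for myself, here is a minimal sketch of a frequency-weighted average of Fourier coefficient magnitudes. It is a 1D toy version under my own assumptions (function sampled on a uniform grid, integer frequency index as the weight), not the paper's exact procedure. The name `fourier_complexity` is made up for illustration.

```python
import numpy as np

def fourier_complexity(values):
    """Frequency-weighted average of Fourier coefficient magnitudes (1D toy version)."""
    values = np.asarray(values, dtype=float)
    coeffs = np.abs(np.fft.rfft(values))  # magnitude of each frequency component
    freqs = np.arange(coeffs.size)        # integer frequency index k = 0, 1, 2, ...
    total = coeffs.sum()
    if total == 0.0:
        return 0.0                        # all-zero signal: treat as lowest complexity
    return float((freqs * coeffs).sum() / total)

# Two functions sampled on the same uniform grid over one period.
x = np.linspace(0.0, 1.0, 256, endpoint=False)
smooth = np.sin(2 * np.pi * 1 * x)    # one slow oscillation
wiggly = np.sin(2 * np.pi * 20 * x)   # twenty fast oscillations

c_smooth = fourier_complexity(smooth)  # close to 1
c_wiggly = fourier_complexity(wiggly)  # close to 20
print(c_smooth, c_wiggly)
```

The faster-oscillating function gets the higher score, which matches the intuition that more high-frequency content means a more complex function.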
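
The reset idea from item 4 can be sketched as a training loop that throws away the parameters and optimizer state at a fixed interval while the buffer of collected data keeps growing. This is a toy illustration only: a linear regression stands in for the RL objective, and `train_with_resets`, `reset_every`, and the helper names are hypothetical, not taken from the paper.

```python
import numpy as np

def init_params(rng, dim):
    # Fresh random weights, used at the start and after every reset.
    return rng.normal(scale=0.1, size=dim)

def init_optimizer_state(dim):
    # Fresh momentum buffer, used at the start and after every reset.
    return {"momentum": np.zeros(dim)}

def train_with_resets(total_steps=300, reset_every=100, dim=4, seed=0):
    rng = np.random.default_rng(seed)
    true_w = np.ones(dim)   # toy regression target (assumption, stands in for the RL task)
    replay_buffer = []      # collected data: never cleared
    params = init_params(rng, dim)
    opt_state = init_optimizer_state(dim)
    num_resets = 0

    for step in range(1, total_steps + 1):
        # Collect one new sample and store it.
        x = rng.normal(size=dim)
        replay_buffer.append((x, float(x @ true_w)))

        # One SGD-with-momentum update on a sample drawn from the buffer.
        xb, yb = replay_buffer[rng.integers(len(replay_buffer))]
        grad = 2.0 * (params @ xb - yb) * xb
        opt_state["momentum"] = 0.9 * opt_state["momentum"] + grad
        params = params - 0.01 * opt_state["momentum"]

        # Periodic reset: new network and new optimizer, same data.
        if step % reset_every == 0 and step < total_steps:
            params = init_params(rng, dim)
            opt_state = init_optimizer_state(dim)
            num_resets += 1

    return params, replay_buffer, num_resets

params, buffer, num_resets = train_with_resets()
print(len(buffer), num_resets)
```

The point to notice is that `replay_buffer` lives outside the reset branch, so the freshly initialized network trains on everything gathered so far.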

Interesting stuff:

  1. Scaling up the actor network showed limited benefits, so they test scaling only the critic network.

    They think this suggests that the target complexity of the actor is lower than that of the critic. [pg. 9]

