Monte Carlo Tree Search
Table of Contents
- PIMC: Perfect Information Monte Carlo search
- ISMCTS: Information Set MCTS (MCTS for games with Hidden Information and Uncertainty, pg. 81)
PIMC suffers from the problem of strategy fusion (MCTS for games with Hidden Information and Uncertainty, pg. 86): each determinization is searched as a separate perfect-information game, so the search can plan to act differently in states that the player cannot actually distinguish at decision time. A sketch of the determinize-and-vote loop follows.
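To make the failure mode concrete, here is a minimal PIMC sketch on a toy one-decision game against a hidden card. The cards, payoffs, and belief are invented for illustration and are not from the cited thesis; `perfect_info_search` is a trivial stand-in for minimax or UCT on a determinized state.

```python
import random
from collections import Counter

# Toy payoffs for one decision against a hidden card ("A", "K", or "Q").
# All values here are invented for illustration.
PAYOFF = {("bet", "A"): -1, ("bet", "K"): 1, ("bet", "Q"): 2,
          ("fold", "A"): 0, ("fold", "K"): 0, ("fold", "Q"): 0}

def sample_determinization(belief):
    """Fix the hidden card to one concrete world, weighted by the belief."""
    cards, weights = zip(*belief.items())
    return random.choices(cards, weights=weights)[0]

def perfect_info_search(card):
    """With the world fully revealed, just pick the payoff-maximizing action
    (stand-in for a real perfect-information search such as minimax/UCT)."""
    return max(("bet", "fold"), key=lambda a: PAYOFF[(a, card)])

def pimc_choose(belief, num_determinizations=200):
    """Determinize, search each world with perfect information, and vote.

    Strategy fusion in miniature: the searcher plans "fold" against an A
    but "bet" against a K or Q, even though all three worlds sit in the
    same information set at the moment of choice.
    """
    votes = Counter(perfect_info_search(sample_determinization(belief))
                    for _ in range(num_determinizations))
    return votes.most_common(1)[0][0]

print(pimc_choose({"A": 0.5, "K": 0.3, "Q": 0.2}))
```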
1. Lessons from Hierarchical Reinforcement Learning
Some MCTS variants incorporate hierarchical concepts that make them functionally similar to HRL:
- Macro-actions in MCTS:
- Instead of expanding one primitive action per edge, MCTS expands macro-actions (fixed action sequences or option policies).
- This is equivalent to executing HRL's low-level policy inside a single tree edge.
- Example: Hierarchical MCTS (H-MCTS) for robotics or RTS games; see the macro-action sketch after this list.
- Task Decomposition via Subgoals:
- HRL defines subgoal states and plans toward them.
- In MCTS, you can insert milestone nodes and only expand between milestones. This reduces the effective branching factor, mirroring HRL's temporal abstraction; see the milestone sketch after this list.
- Abstract MCTS Layers:
- Some planners run MCTS at a coarse abstraction to choose high-level strategies, with a separate MCTS for fine-grained control.
- Example: Planning in StarCraft II, where high-level tree nodes correspond to build orders and low-level tree nodes are tactical micro-actions; see the two-layer sketch below.
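Macro-action sketch: a minimal UCT loop in which each tree edge applies a whole primitive sequence, so the "low-level policy" runs inside one edge. The chain world, macro set, and constants are illustrative assumptions, not taken from any particular H-MCTS paper.

```python
import math
import random

GOAL, HORIZON = 6, 12
PRIMITIVES = (+1, -1)
MACROS = [(+1, +1, +1), (-1, -1, -1), (+1,), (-1,)]  # edges = action sequences

def step_macro(pos, macro):
    """Execute a whole primitive sequence inside a single tree edge."""
    for move in macro:
        pos += move
    return pos

class Node:
    def __init__(self, pos, depth):
        self.pos, self.depth = pos, depth
        self.children = {}                    # macro -> child Node
        self.visits, self.value = 0, 0.0

def rollout(pos, depth):
    """Random primitive walk to the horizon; reward 1 iff the goal is hit."""
    while depth < HORIZON and pos != GOAL:
        pos += random.choice(PRIMITIVES)
        depth += 1
    return 1.0 if pos == GOAL else 0.0

def simulate(node):
    """One UCT iteration: select/expand over macro edges, then roll out."""
    if node.depth >= HORIZON or node.pos == GOAL:
        reward = 1.0 if node.pos == GOAL else 0.0
    elif len(node.children) < len(MACROS):    # expansion: add one macro edge
        macro = MACROS[len(node.children)]
        child = Node(step_macro(node.pos, macro), node.depth + len(macro))
        node.children[macro] = child
        reward = rollout(child.pos, child.depth)
        child.visits, child.value = 1, reward
    else:                                     # selection: UCB1 over macro edges
        child = max(node.children.values(),
                    key=lambda ch: ch.value / ch.visits
                    + 1.4 * math.sqrt(math.log(node.visits) / ch.visits))
        reward = simulate(child)
    node.visits += 1
    node.value += reward
    return reward

root = Node(pos=0, depth=0)
for _ in range(2000):
    simulate(root)
best = max(root.children, key=lambda m: root.children[m].visits)
print("most-visited first macro:", best)
```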
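Milestone sketch: the branching reduction from subgoals, shown without the full MCTS machinery. The top level searches only over milestone orderings (brute-force enumeration here, standing in for tree search), while a primitive-level cost (Manhattan distance, standing in for a low-level search such as BFS or a nested MCTS) fills in each leg. The grid, milestones, and costs are invented for illustration.

```python
from itertools import permutations

def leg_cost(a, b):
    """Low-level leg between two milestones; Manhattan distance stands in
    for a primitive-action search such as BFS or a nested MCTS."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def plan_via_milestones(start, goal, milestones):
    """Search over milestone orderings instead of primitive actions: the
    top-level branching factor is |milestones|, not |actions|^depth."""
    best_order, best_cost = None, float("inf")
    for order in permutations(milestones):
        points = [start, *order, goal]
        cost = sum(leg_cost(a, b) for a, b in zip(points, points[1:]))
        if cost < best_cost:
            best_order, best_cost = order, cost
    return best_order, best_cost

order, cost = plan_via_milestones((0, 0), (6, 6), [(2, 3), (5, 1), (4, 5)])
print("milestone order:", order, "| total primitive cost:", cost)
```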
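Two-layer sketch in the spirit of the StarCraft II example: a coarse search commits to a high-level strategy, and a separate low-level search picks tactics conditioned on it. Flat Monte Carlo stands in for a full MCTS at each layer, and every name and win rate below is an invented stand-in.

```python
import random

STRATEGIES = ["rush", "economy", "tech"]     # coarse layer: build orders
TACTICS = ["attack", "harass", "defend"]     # fine layer: micro-actions

def coarse_rollout(strategy):
    """Cheap abstract simulation of a whole game under one strategy
    (the win rates are invented for illustration)."""
    win_rate = {"rush": 0.45, "economy": 0.55, "tech": 0.50}[strategy]
    return 1.0 if random.random() < win_rate else 0.0

def fine_rollout(strategy, tactic):
    """Cheap concrete simulation of one skirmish under the chosen strategy."""
    synergy = 0.2 if (strategy, tactic) == ("rush", "attack") else 0.0
    return 1.0 if random.random() < 0.4 + synergy else 0.0

def flat_monte_carlo(options, rollout, n_per_option=300):
    """Flat Monte Carlo stands in for a full MCTS at each layer."""
    def mean_return(option):
        return sum(rollout(option) for _ in range(n_per_option)) / n_per_option
    return max(options, key=mean_return)

# The coarse layer commits first; the fine layer plans only under it.
strategy = flat_monte_carlo(STRATEGIES, coarse_rollout)
tactic = flat_monte_carlo(TACTICS, lambda t: fine_rollout(strategy, t))
print("strategy:", strategy, "| tactic:", tactic)
```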