This work investigates how large reasoning models internally track their thinking progress, and how that process can be monitored and controlled. We focus on reasoning models that explicitly delimit their reasoning with <think> and </think> tokens (e.g., DeepSeek-R1), which lets us study the internal dynamics of the "thinking phase."

1. Monitoring the Thinking Phase

We hypothesize that hidden states encode a token's relative position within the thinking phase. To test this, we collect the model's final-layer hidden representation for each token in a thinking trajectory $T^{(k)} = w_1 w_2 \ldots w_{N_k}$ and pair each token with its normalized position:

$$p_j^{(k)} = j / N_k$$

This yields a dataset $D = \{ (h_j^{(k)}, p_j^{(k)}) \}$, where $h \in \mathbb{R}^d$ is the hidden state and $p \in (0, 1]$ is the relative position. We then learn a regression function:

$$\theta^* = \arg\min_\theta \sum_{k, j} \left( f_\theta(h_j^{(k)}) - p_j^{(k)} \right)^2$$

We compare a linear regressor (TPV: Thinking Progress Vector) with a 2-layer FFN and find no improvement from the latter, favoring the simpler TPV model (a minimal sketch of the probe appears after the Findings section).

For improved temporal modeling, we also train a single-layer GRU on full token sequences:

$$D' = \{ ((h_1, \ldots, h_{N_k}), (p_1, \ldots, p_{N_k})) \}_k$$

The GRU outperforms TPV, especially when generalizing from MATH-500 to GSM8K, in both fine-tuned and zero-shot setups.

2. Controlling the Thinking Phase

To test whether TPVs are causally involved in reasoning rather than merely correlated with it, we intervene on hidden states during decoding:

$$h^\alpha = h + \alpha\theta \quad \rightarrow \quad \theta^T h^\alpha = \theta^T h + \alpha||\theta||^2$$

so the shift moves the probe's progress prediction by exactly $\alpha||\theta||^2$ while leaving components orthogonal to $\theta$ unchanged. The intervention is applied after the attention layers so that its effect is isolated to a single token step (see the hook sketch after the Findings section). We refer to this manipulation as "overclocking" when $\alpha > 0$. Empirically, overclocking produces more concise and decisive reasoning while maintaining correctness.

Findings

Original trajectories often exhibit repetition and hesitation. Overclocked outputs are shorter and more linear in progress prediction. In some c...
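For concreteness, here is a minimal PyTorch sketch of the monitoring probe from Section 1. The function names, hyperparameters, and the use of a mean (rather than summed) squared error are illustrative assumptions; the mean loss differs from the objective above only by a constant factor.

```python
import torch
import torch.nn as nn

def build_dataset(trajectories):
    """Flatten trajectories into (h, p) pairs with p = j / N_k in (0, 1].

    trajectories: list of [N_k, d] tensors of final-layer hidden states,
    one per thinking trajectory.
    """
    feats, labels = [], []
    for traj in trajectories:
        n = traj.shape[0]
        feats.append(traj)
        labels.append(torch.arange(1, n + 1, dtype=torch.float32) / n)
    return torch.cat(feats), torch.cat(labels)

class TPV(nn.Module):
    """Linear probe f_theta(h) = theta^T h + b predicting relative position."""
    def __init__(self, d):
        super().__init__()
        self.proj = nn.Linear(d, 1)

    def forward(self, h):
        return self.proj(h).squeeze(-1)

def fit_tpv(trajectories, d, epochs=50, lr=1e-3):
    """Fit the linear probe with a squared-error objective."""
    X, y = build_dataset(trajectories)
    probe = TPV(d)
    opt = torch.optim.Adam(probe.parameters(), lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        nn.functional.mse_loss(probe(X), y).backward()
        opt.step()
    return probe
```

The learned weight `probe.proj.weight` is the direction that serves as $\theta$ in the intervention of Section 2.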
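The sequential variant swaps the per-token probe for a recurrent readout over the whole hidden-state sequence, as in the $D'$ formulation. In this sketch the hidden size, the sigmoid head (which keeps predictions in $(0, 1)$), and per-trajectory batching are our assumptions:

```python
class TPVGRU(nn.Module):
    """Single-layer GRU reading the hidden-state sequence; emits a
    relative-position estimate for every token."""
    def __init__(self, d, hidden=256):
        super().__init__()
        self.gru = nn.GRU(d, hidden, num_layers=1, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, h_seq):                          # h_seq: [B, N, d]
        out, _ = self.gru(h_seq)                       # [B, N, hidden]
        return torch.sigmoid(self.head(out)).squeeze(-1)  # [B, N]

def fit_gru(trajectories, d, epochs=10, lr=1e-3):
    """Train on one trajectory at a time (batch size 1) to avoid padding."""
    model = TPVGRU(d)
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        for traj in trajectories:                      # traj: [N_k, d]
            n = traj.shape[0]
            target = torch.arange(1, n + 1, dtype=torch.float32) / n
            opt.zero_grad()
            pred = model(traj.unsqueeze(0)).squeeze(0)
            nn.functional.mse_loss(pred, target).backward()
            opt.step()
    return model
```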
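The overclocking intervention reduces to adding a fixed vector to the hidden state at each decoding step. Below is a sketch using a PyTorch forward hook; the layer path in the usage comment assumes a HuggingFace-style decoder and is hypothetical:

```python
def make_overclock_hook(theta, alpha):
    """Forward hook implementing h_alpha = h + alpha * theta.

    theta is deliberately not normalized, so the probe's prediction shifts
    by exactly alpha * ||theta||^2, matching the identity above.
    """
    def hook(module, inputs, output):
        hidden = output[0] if isinstance(output, tuple) else output
        shifted = hidden + alpha * theta.to(hidden.device, hidden.dtype)
        if isinstance(output, tuple):
            return (shifted,) + tuple(output[1:])
        return shifted
    return hook

# Hypothetical usage (the layer path varies by architecture):
# theta = probe.proj.weight.detach().squeeze(0)    # TPV direction
# handle = model.model.layers[-1].register_forward_hook(
#     make_overclock_hook(theta, alpha=5.0))
# out = model.generate(**inputs)                   # overclocked decoding
# handle.remove()
```

Registering the hook on the final decoder layer applies the shift after that layer's attention block, consistent with keeping the edit isolated to the current token step.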