Discovered paper pair (Session 38). Detailed explanation not available.
The optimal strategy undergoes a phase transition governed by the task-difficulty regime. The system exhibits two qualitatively different behaviors (power decay vs. warmup-stable-decay), separated by a sharp threshold that is determined by the ratio of the signal learning rate to the noise forgetting capacity. The easy-task regime benefits from gradual decay; the hard-task regime requires maintaining peak resources until near completion.
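A minimal sketch of the two schedule shapes and a threshold-based switch between them, to make the contrast concrete. Everything here is an assumption for illustration, not the paper's method: the function names, the decay exponent, the step counts, the threshold value of 1.0, and in particular the choice that a ratio above the threshold corresponds to the easy-task regime (the summary does not say which side of the threshold is which).

```python
def power_decay(step, peak=1.0, exponent=0.5):
    """Gradual decay: the value shrinks as a power of the step index.

    The exponent is a hypothetical choice for illustration.
    """
    return peak * (1.0 + step) ** (-exponent)


def warmup_stable_decay(step, total_steps, peak=1.0, warmup_steps=5, decay_steps=10):
    """Ramp up, hold the peak for most of the run, then decay linearly at the end.

    warmup_steps and decay_steps are hypothetical values.
    """
    decay_start = total_steps - decay_steps
    if step < warmup_steps:
        # Linear warmup towards the peak.
        return peak * (step + 1) / warmup_steps
    if step < decay_start:
        # Stable phase: hold the peak value.
        return peak
    # Final phase: linear decay towards zero at total_steps.
    return peak * (total_steps - step) / decay_steps


def choose_schedule(signal_rate, noise_capacity, threshold=1.0):
    """Pick a schedule name from the ratio described in the summary.

    ASSUMPTION: a ratio above the threshold is treated as the easy-task regime.
    Both the threshold value and this direction are illustrative only.
    """
    ratio = signal_rate / noise_capacity
    return "power_decay" if ratio > threshold else "warmup_stable_decay"
```

Under these assumptions, `choose_schedule(2.0, 1.0)` selects the gradual schedule, while `choose_schedule(0.5, 1.0)` selects the schedule that holds the peak until the last `decay_steps` steps.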
Dual-signal adaptive resource allocation using entropy as a proxy for learning value. At the global level, the relative entropy change identifies high-value samples for prioritized exploration. At the local level, absolute entropy peaks identify critical moments within each trajectory. Computation is concentrated where information gain is highest rather than distributed uniformly.
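The two signals can be sketched roughly as follows. This is an illustrative reading of the summary, not the paper's algorithm: the helper names, the use of the magnitude of the relative change, the normalization into sampling weights, and the `top_k` peak selection are all assumptions.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def global_priorities(prev_entropy, curr_entropy, eps=1e-8):
    """Global signal: relative entropy change per sample, as sampling weights.

    ASSUMPTION: the magnitude of the relative change is used, and weights are
    normalized to sum to 1. Falls back to uniform if nothing changed.
    """
    changes = [abs(c - p) / (abs(p) + eps) for p, c in zip(prev_entropy, curr_entropy)]
    total = sum(changes)
    if total == 0.0:
        return [1.0 / len(changes)] * len(changes)
    return [c / total for c in changes]


def local_peaks(step_entropies, top_k=2):
    """Local signal: indices of the top_k highest local maxima in a trajectory.

    ASSUMPTION: a 'peak' is a step whose entropy exceeds its left neighbour
    and is at least as large as its right neighbour.
    """
    s = step_entropies
    last = len(s) - 1
    peaks = [
        i for i in range(len(s))
        if (i == 0 or s[i] > s[i - 1]) and (i == last or s[i] >= s[i + 1])
    ]
    # Keep the highest-entropy peaks, returned in trajectory order.
    peaks.sort(key=lambda i: s[i], reverse=True)
    return sorted(peaks[:top_k])
```

For example, `global_priorities([1.0, 1.0], [1.0, 0.5])` puts all weight on the second sample, and `local_peaks([0.1, 0.9, 0.2, 0.3, 0.8, 0.1])` flags steps 1 and 4 as the moments to spend extra computation on.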