February 11, 2025
We released DeepScaleR, a 1.5B-parameter model that surpasses o1-preview by scaling RL 🔥. Check out our code.