We released DeepScaleR, a 1.5B-parameter model that surpasses o1-preview by scaling RL🔥. Check out our code.