According to Woofun AI data, the open-source GLM 5.2 model ran at a significantly lower cost in academic reproducibility tests conducted by the alphaXiv research platform. In an experiment reproducing the SDPO paper, GLM 5.2 succeeded after 14 attempts, consuming 2.65 million tokens at a total cost of $6.21.
In contrast, the closed-source Claude Opus 4.8 Max failed 9 times before succeeding, using 4.53 million tokens at a total cost of $46.35. The test required the model to read the paper autonomously, troubleshoot VeRL library errors, and complete ablation studies.
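To put the reported figures side by side, the short Python sketch below computes the cost ratio, the token ratio, and a blended cost per million tokens for each run. It uses only the numbers quoted above. The "blended" rate is an assumption made for illustration: it is total cost divided by total tokens, and it does not separate input from output pricing, which the source does not break down. The names `runs` and `blended_cost_per_million` are hypothetical.

```python
# Figures as reported in the text; nothing here is measured independently.
runs = {
    "GLM 5.2": {"tokens_millions": 2.65, "cost_usd": 6.21},
    "Claude Opus 4.8 Max": {"tokens_millions": 4.53, "cost_usd": 46.35},
}


def blended_cost_per_million(run):
    """Total cost divided by total tokens (assumption: no input/output split)."""
    return run["cost_usd"] / run["tokens_millions"]


glm = runs["GLM 5.2"]
opus = runs["Claude Opus 4.8 Max"]

# How many times more the closed-source run cost and consumed.
cost_ratio = opus["cost_usd"] / glm["cost_usd"]
token_ratio = opus["tokens_millions"] / glm["tokens_millions"]

for name, run in runs.items():
    rate = blended_cost_per_million(run)
    print(f"{name}: ${rate:.2f} per million tokens (blended)")

print(f"Cost ratio (Opus / GLM): {cost_ratio:.2f}x")
print(f"Token ratio (Opus / GLM): {token_ratio:.2f}x")
```

On these figures, the Claude Opus 4.8 Max run cost roughly 7.5 times as much while using about 1.7 times as many tokens, so most of the gap comes from the per-token rate (about $10.23 versus $2.34 per million, blended) rather than from token volume.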