Woofun AI reports that Unsloth has released GGUF builds of SmartSpectrum AI's 753-billion-parameter GLM-5.2 model, using dynamic quantization to cut the original 1.51 TB of weights by over 80%. The compressed variants range from 217 GB for the 1-bit version to 239 GB for the 2-bit UD-IQ2_M variant, small enough to run offline on a single Mac Studio.
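The size figures above can be sanity-checked with a little arithmetic. The sketch below is illustrative only: it assumes decimal units (1 GB = 10^9 bytes, 1 TB = 10^12 bytes) and treats each file as consisting entirely of weights, neither of which the report states. The function names are hypothetical helpers, not part of any Unsloth or llama.cpp API.

```python
# Figures quoted in the report. Decimal units (1 GB = 1e9 bytes) are an assumption.
PARAMS = 753e9            # parameter count
ORIGINAL_BYTES = 1.51e12  # original model size
VARIANTS = {"1-bit": 217e9, "UD-IQ2_M": 239e9}  # quantized file sizes


def size_reduction(original_bytes: float, compressed_bytes: float) -> float:
    """Fraction of the original size removed by compression."""
    return 1.0 - compressed_bytes / original_bytes


def bits_per_weight(size_bytes: float, n_params: float) -> float:
    """Average storage per parameter, assuming the file holds only weights."""
    return size_bytes * 8 / n_params


if __name__ == "__main__":
    print(f"original: {bits_per_weight(ORIGINAL_BYTES, PARAMS):.2f} bits/weight")
    for name, size in VARIANTS.items():
        reduction = size_reduction(ORIGINAL_BYTES, size)
        bpw = bits_per_weight(size, PARAMS)
        print(f"{name}: {reduction:.1%} smaller, {bpw:.2f} bits/weight")
```

Under these assumptions the reductions come to roughly 85.6% and 84.2%, which agrees with the "over 80%" claim. The original works out to about 16 bits per weight, and the "1-bit" file averages about 2.3 bits per weight, which would be consistent with a dynamic scheme that keeps some tensors at higher precision.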
Performance benchmarks on a Mac Studio M3 Ultra with 256 GB unified memory show an inference speed of 21.6 tokens per second while preserving 76% to 82% of the original model's accuracy. In comparative tests, the locally run 1-bit GLM-5.2 generated a complete HTML5 game with quality comparable to Claude 4.8 Opus and GPT-5.5. The GLM-5.2 GGUF weights are now available on Hugging Face for execution via llama.cpp or Unsloth Studio.
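As a rough illustration of what these benchmark numbers imply in practice, the sketch below estimates the memory left over after loading each variant and the time to generate a response at the reported speed. It is a back-of-the-envelope calculation under stated assumptions: the file sizes and the 256 GB memory figure are taken to be in the same units, the 4,000-token response length is hypothetical, and the report does not say which variant produced the 21.6 tokens per second figure. The helper names are made up for this example.

```python
# Figures quoted in the report.
UNIFIED_MEMORY_GB = 256   # Mac Studio M3 Ultra unified memory
TOKENS_PER_SECOND = 21.6  # reported inference speed (variant not specified)


def memory_headroom_gb(model_gb: float, total_gb: float = UNIFIED_MEMORY_GB) -> float:
    """Memory left after loading the weights, before KV cache and OS overhead."""
    return total_gb - model_gb


def generation_seconds(n_tokens: int, tokens_per_second: float = TOKENS_PER_SECOND) -> float:
    """Time to generate n_tokens at a constant decoding speed."""
    return n_tokens / tokens_per_second


if __name__ == "__main__":
    for name, size_gb in {"1-bit": 217, "UD-IQ2_M": 239}.items():
        print(f"{name}: {memory_headroom_gb(size_gb)} GB headroom")
    # 4000 tokens is a hypothetical response length, not a figure from the report.
    print(f"4000 tokens: {generation_seconds(4000):.0f} s")
```

The headroom is an upper bound: the KV cache, the operating system, and other processes all draw on the same unified memory, so the usable context length depends on how much of the remainder is actually available.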