Alibaba Deploys Qwen-Robot Suite for Zero-Shot Physical Action Alignment and Multi-Domain Generalization

2026-06-16 14:26

According to Woofun AI, Alibaba's large-model team has deployed the Qwen-Robot Suite, a collection of embodied-intelligence foundation models designed to align vision-language capabilities with physical actions across multiple domains. The suite comprises three specialized models: Qwen-RobotNav for navigation, Qwen-RobotManip for manipulation, and Qwen-RobotWorld for world simulation. Together they are intended to support multi-task operation and generalization across diverse robotic platforms.

Qwen-RobotNav, trained on 15.6 million samples, parameterizes its visual attention strategy so that the token budget can be adjusted dynamically during inference. It reportedly achieves state-of-the-art performance in five navigation domains and supports zero-shot deployment on the Unitree (Yushu) Go2 quadruped robot.

Qwen-RobotManip pairs a Qwen3.5-4B VL backbone with a flow-matching DiT action head. Trained on more than 38,100 hours of data, it attains a 91.4% success rate in LIBERO-Plus evaluations.
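For readers unfamiliar with the term, a flow-matching action head generates an action by starting from random noise and integrating a velocity field from t = 0 to t = 1. The sketch below illustrates only that sampling loop and is not Qwen-RobotManip's implementation: the function names, the step count, and the three-value action are hypothetical, and the learned DiT network is replaced by a closed-form toy velocity that points straight at a fixed target.

```python
import random

def toy_velocity(x, t, target):
    # Stand-in for the learned network (assumption): the velocity of a
    # straight-line path that reaches `target` at t = 1. A real action head
    # would predict this from images, robot state, and the instruction.
    return [(g - xi) / (1.0 - t) for xi, g in zip(x, target)]

def sample_action(target, steps=10, seed=0):
    # Start from Gaussian noise and take Euler steps along the velocity field.
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in target]
    dt = 1.0 / steps
    for k in range(steps):
        t = k * dt  # t stays below 1, so the toy velocity never divides by zero
        v = toy_velocity(x, t, target)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x

# Hypothetical 3-dimensional action (for example, an end-effector offset).
action = sample_action([0.2, -0.5, 0.1])
print(action)
```

With this toy field the final Euler step lands on the target up to floating-point error, which makes the loop easy to check. With a learned velocity field the result instead depends on the observation and instruction supplied to the network.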

Qwen-RobotWorld employs a 60-layer dual-stream MMDiT architecture that couples semantic representations with video latent variables. After training on 8.6 million video-text pairs, it ranks first on physical-law-compliance benchmarks such as EWMBench.

All three models are driven through a language-first interface and are integrated into the Qwen-RobotClaw framework, which allows upper-level planners to execute multi-step physical operations.
