Baidu Open-Sources Unlimited-OCR Document Parsing Model With Reference Sliding Window Attention Mechanism

2026-06-23 18:45

Baidu has released Unlimited-OCR, a large model for intelligent document parsing, together with a technical report. Industry speculation links the project's CTO, 'YY', to Wei Haoran, a former core author of DeepSeek-OCR. According to data compiled by Woofun AI, Unlimited-OCR scored 93.92% on the OmniDocBench v1.6 long-document parsing benchmark, a new state-of-the-art (SOTA) result for end-to-end models.

In traditional models, the key-value (KV) cache grows linearly as output is generated, which slows decoding and drives up GPU memory consumption. To address this, Baidu uses the Reference Sliding Window Attention mechanism (R-SWA). During decoding, the model attends only to all image features plus a fixed window of the most recently generated text (128 tokens by default), so the total KV cache size stays constant. Because the image features always remain in view, R-SWA avoids blurring image detail as the window advances, and it keeps inference speed and GPU memory usage stable on documents longer than 40 pages, delivering a 12.7% speedup over DeepSeek-OCR. Baidu has released the code and weights under the MIT license, with support for Hugging Face Transformers, vLLM, and SGLang, and plans to extend R-SWA to automatic speech recognition (ASR) and translation tasks.
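
The cache behaviour described above can be illustrated with a toy sketch. This is not Baidu's implementation: the class name `ReferenceSlidingWindowCache`, the helper `cache_sizes`, and the figure of 1,000 image tokens are all hypothetical, chosen only to show why keeping every image entry plus a bounded text window leaves the cache size flat while a full cache keeps growing. Only the 128-token default window comes from the report as summarized here.

```python
from collections import deque


class ReferenceSlidingWindowCache:
    """Toy KV cache: permanent reference (image) entries plus a bounded text window.

    Hypothetical illustration only; entries stand in for key/value pairs.
    """

    def __init__(self, image_entries, window=128):
        self.reference = list(image_entries)  # image entries are never evicted
        self.recent = deque(maxlen=window)    # oldest text entry drops out when full

    def append(self, text_entry):
        # Adding a new text entry evicts the oldest one once the window is full.
        self.recent.append(text_entry)

    def visible(self):
        # Everything the next decoding step may attend to.
        return self.reference + list(self.recent)

    def __len__(self):
        return len(self.reference) + len(self.recent)


def cache_sizes(num_image_tokens, num_steps, window=128):
    """Return (full_cache_size, windowed_cache_size) after each decoding step."""
    cache = ReferenceSlidingWindowCache(range(num_image_tokens), window)
    full = num_image_tokens  # a full cache keeps every generated token
    sizes = []
    for step in range(num_steps):
        cache.append(("text", step))
        full += 1
        sizes.append((full, len(cache)))
    return sizes


if __name__ == "__main__":
    sizes = cache_sizes(num_image_tokens=1000, num_steps=500)
    for step in (1, 128, 129, 500):
        full, windowed = sizes[step - 1]
        print(f"step {step:>3}: full cache = {full}, windowed cache = {windowed}")
```

In this sketch the windowed cache stops growing once the text window fills, while the full cache adds one entry per generated token. Whether the real model counts the current token inside the window, and how it stores the image features, are details this toy does not attempt to capture.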

