Gemini 3.5 Flash Integrates Native PC Control, Streamlining Enterprise Agent Architecture

2026-06-25 11:34

Woofun AI reports that Google has built the Computer Use capability directly into the Gemini 3.5 Flash model. Developers can now control devices through the Gemini API or the Gemini Enterprise Agent Platform on Google Cloud without calling a separate, specialized model, which simplifies agent architecture. The model takes screenshots of browser, mobile, or desktop environments as visual input, reasons step by step, and then outputs actions such as mouse clicks, keyboard input, and menu navigation. This lets it automate tasks such as software regression testing and cross-page data collection. To aid debugging, the model attaches an "intent" field to each generated action that explains the reasoning behind it.
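The screenshot-to-action loop described above can be sketched roughly as follows. This is a minimal illustration, not Google's implementation: the `Action` type, the `run_agent` function, and the stubbed `fake_capture` and `fake_propose` callables are all hypothetical names, and no real Gemini API call is made. Only the "intent" field comes from the report itself.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Action:
    """One step proposed by the model (hypothetical shape)."""
    kind: str                   # e.g. "click", "type", or "done"
    intent: str                 # explanation of why this step was chosen
    x: Optional[int] = None     # screen coordinates for pointer actions
    y: Optional[int] = None
    text: Optional[str] = None  # text for keyboard actions


def run_agent(
    goal: str,
    capture: Callable[[], bytes],
    propose: Callable[[str, bytes, List[Action]], Action],
    execute: Callable[[Action], None],
    max_steps: int = 10,
) -> List[Action]:
    """Perceive (screenshot), reason (propose), act (execute), repeat."""
    history: List[Action] = []
    for _ in range(max_steps):
        screenshot = capture()
        action = propose(goal, screenshot, history)
        history.append(action)
        if action.kind == "done":
            break
        execute(action)
    return history


# Stand-ins for a real screen grabber and a real model call (assumptions).
def fake_capture() -> bytes:
    return b"fake-png-bytes"


def fake_propose(goal: str, screenshot: bytes, history: List[Action]) -> Action:
    script = [
        Action("click", "Focus the search box", x=120, y=48),
        Action("type", "Enter the query from the goal", text=goal),
        Action("done", "Results are visible; task complete"),
    ]
    return script[len(history)]


executed: List[Action] = []
trace = run_agent("quarterly report", fake_capture, fake_propose, executed.append)
for step in trace:
    # Logging the intent alongside each action is what makes a run debuggable.
    print(f"{step.kind}: {step.intent}")
```

In a real integration, `propose` would wrap the model call and `execute` would drive a browser or operating-system automation layer; both are left abstract here because the report does not describe their interfaces.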

To reduce prompt injection risk in real-world web environments, Google applied targeted adversarial training and added two optional safeguards: required human approval for irreversible operations involving fund transfers or file deletions, and automatic task termination when indirectly injected instructions are detected in a screenshot. Browserbase hosts an online demo environment at gemini.browserbase.com, and Google has open-sourced a reference implementation, computer-use-preview, on GitHub.
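One way to picture the two optional safeguards is as a gate that sits between the model's proposed action and its execution. The sketch below is an assumption-laden illustration: the `ProposedAction` fields `irreversible` and `injection_suspected`, the `Verdict` enum, and the `gate` function are hypothetical, since the report does not say how these conditions are signaled.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class Verdict(Enum):
    EXECUTE = "execute"      # safe to run the action
    BLOCKED = "blocked"      # human declined an irreversible action
    TERMINATE = "terminate"  # stop the whole task


@dataclass
class ProposedAction:
    kind: str
    intent: str
    irreversible: bool = False         # hypothetical flag, e.g. fund transfer or file deletion
    injection_suspected: bool = False  # hypothetical flag, e.g. instructions found in a screenshot


def gate(action: ProposedAction, approve: Callable[[ProposedAction], bool]) -> Verdict:
    """Apply both safeguards before an action is allowed to run."""
    # Suspected injection ends the task outright, with no approval prompt.
    if action.injection_suspected:
        return Verdict.TERMINATE
    # Irreversible actions run only if a human explicitly approves them.
    if action.irreversible and not approve(action):
        return Verdict.BLOCKED
    return Verdict.EXECUTE


def deny_all(action: ProposedAction) -> bool:
    return False


def allow_all(action: ProposedAction) -> bool:
    return True


scroll = ProposedAction("scroll", "Reveal more rows")
delete = ProposedAction("delete_file", "Remove the old export", irreversible=True)
hijack = ProposedAction("click", "Follow on-page instruction", injection_suspected=True)

results = [
    gate(scroll, deny_all),
    gate(delete, deny_all),
    gate(delete, allow_all),
    gate(hijack, allow_all),
]
print([r.value for r in results])
```

Checking for injection before asking for approval is a deliberate choice in this sketch: a compromised step should not reach the human prompt at all.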
