Gemini 3.5 Flash can now see and control your screen, and Google wants enterprises to trust it

The capability that could tip the balance for AI agents arrived without much fanfare: Google has made computer use a built-in feature of Gemini 3.5 Flash, the model unveiled at I/O 2026 as its fastest agentic LLM. Until now, enabling an AI to see a screen and to click, type, and scroll across browsers, mobile devices, and desktops meant relying on a separate standalone model, with all the integration headaches that entailed. Now the capability is native and immediately available, and the Mountain View company is betting that enterprises will finally trust it.

What changes for developers

The move considerably simplifies the development pipeline for autonomous agents. Instead of orchestrating multiple models, with context handoffs and the friction points they create, developers can now invoke Gemini 3.5 Flash directly to parse a UI and act on it. Removing the handoff between models should reduce the latency a user perceives at each step and cut architectural complexity, a concrete benefit for teams building prototypes or enterprise integrations. Google hasn’t shared specific benchmarks, but the “fastest agentic” label suggests heavy optimization for tasks where responsiveness is critical.
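To make the single-model pattern concrete, here is a minimal Python sketch of the observe-and-act loop described above. It is an illustration under stated assumptions, not Google's API: the `Action` type, `run_agent`, and the `fake_model` stub are all hypothetical names, and a real integration would replace the stub with a call to the provider's SDK and with real screenshot and input-control code.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Action:
    """One UI step proposed by the model (hypothetical shape)."""
    kind: str        # "click", "type", "scroll", or "done"
    target: str = ""
    text: str = ""


# Hypothetical signature: one model receives the goal, the current
# screenshot, and the action history, and returns the next action.
ModelFn = Callable[[str, bytes, List[Action]], Action]


def run_agent(
    goal: str,
    model: ModelFn,
    capture_screen: Callable[[], bytes],
    execute: Callable[[Action], None],
    max_steps: int = 10,
) -> List[Action]:
    """Loop: capture the screen, ask the single model, act, repeat."""
    history: List[Action] = []
    for _ in range(max_steps):
        screenshot = capture_screen()
        action = model(goal, screenshot, history)
        history.append(action)
        if action.kind == "done":
            break
        execute(action)
    return history


def fake_model(goal: str, screenshot: bytes, history: List[Action]) -> Action:
    """Stand-in for a real model call; replays a fixed script."""
    script = [
        Action("click", target="search box"),
        Action("type", text=goal),
        Action("done"),
    ]
    return script[min(len(history), len(script) - 1)]


if __name__ == "__main__":
    executed: List[Action] = []
    steps = run_agent("quarterly report", fake_model, lambda: b"", executed.append)
    print([a.kind for a in steps])
```

The point of the sketch is structural: there is one model call per step and no second model to pass context to, which is the simplification the native integration promises. The `max_steps` cap is a design choice of this example, a basic guard against an agent that never reports completion.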

Why enterprise trust is more than a buzzword

But the real contest isn’t just about speed. Google knows that large organizations are intrigued by AI agents that can interact with business software, yet they hesitate once control and transparency come into play. Letting an LLM see and manipulate screens containing sensitive data, from financial dashboards to medical records, raises sharp questions about security and compliance. It’s no accident that the company stresses the need for “trust”: for those managing regulated infrastructure, every API call that leaves the data center can become an audit headache.

The unresolved sovereignty question

Here’s where Google’s news collides with the concerns of those evaluating alternatives to the public cloud. Computer use is currently delivered via API, inside an ecosystem controlled by an external provider. For organizations in highly regulated sectors, or those that have chosen an on-premises path to retain data sovereignty, the risk is ending up with an advanced feature tied to a consumption model they can’t adopt without sending sensitive screen content outside their own perimeter. This is why the evolution of LLMs isn’t just about performance: it’s about the ability to run inference locally, on owned hardware, without giving up advanced capabilities like screen control.

Beyond the headlines

The native integration of computer use into Gemini 3.5 Flash signals that AI agents are maturing, but it also reignites the debate over who controls the intelligence that acts on our behalf. In the coming months, pressure will likely mount on vendors and the open-source community to bring similar capabilities to models that can be self-hosted, reducing reliance on external APIs. For those who decide daily where and how AI workloads run, the real question isn’t “how fast?” but “can I trust it to work with my data, on my infrastructure?” Google has opened a door; the job for developers and architects is to decide whether to walk through it or build their own.