Qwen 3.5 9B as a local agent: tests on M1 MacBook Pro

A user shared their experience using the Qwen 3.5 9B language model as a personal automation agent, running it locally on a MacBook Pro equipped with an M1 chip and 16 GB of unified memory.

The goal was to assess whether a model of this size could handle real automation tasks without resorting to cloud services.

Setup and results

Setup was straightforward thanks to Ollama, a framework that exposes an OpenAI-compatible API. The user could therefore point their existing automation system at the local Qwen 3.5 9B instance without any code changes.
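As a minimal sketch of what "pointing at the local instance" looks like: Ollama serves an OpenAI-compatible endpoint on port 11434 by default, so any OpenAI-style client can target it by swapping the base URL. The model tag used below ("qwen3.5:9b") is an assumption about the local install, not confirmed by the source.

```python
import json
import urllib.request

# Ollama's OpenAI-compatible endpoint (default local port).
OLLAMA_URL = "http://localhost:11434/v1/chat/completions"
# Assumed model tag; the actual tag depends on what was pulled locally.
MODEL = "qwen3.5:9b"

def build_chat_request(prompt: str) -> urllib.request.Request:
    """Build an OpenAI-style chat-completions request for the local server."""
    payload = {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_chat_request("Summarize today's notes in one sentence.")
# Actually sending it requires a running Ollama instance:
#   with urllib.request.urlopen(req) as resp: print(resp.read())
```

Because the request shape matches the OpenAI API, existing automation code typically needs only the base URL and model name changed.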

The results were encouraging:

  • Memory recall: The model correctly handled reading structured memory files and extracting relevant context.
  • Tool calling: Qwen 3.5 9B demonstrated the ability to invoke the appropriate tools for simple agentic tasks.
  • Complex reasoning: here limitations emerged, as expected for a model of this size.

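To make the "tool calling" result concrete, here is a hedged sketch of the loop an agent runs around a model: the model emits a tool call (a name plus JSON-encoded arguments, the shape chat-completions APIs return), and the agent dispatches it to a registered function. The tool name and function below are illustrative, not the user's actual automation tools.

```python
import json

# Hypothetical tool: a stand-in for reading a structured memory file.
def read_memory(path: str) -> str:
    return f"(contents of {path})"

# Registry mapping tool names the model may emit to local functions.
TOOLS = {"read_memory": read_memory}

def dispatch_tool_call(call: dict) -> str:
    """Execute a tool call of the shape {"name": ..., "arguments": "<JSON>"}."""
    fn = TOOLS[call["name"]]
    args = json.loads(call["arguments"])
    return fn(**args)

# Example: the kind of call a small model might emit for a memory-recall task.
result = dispatch_tool_call(
    {"name": "read_memory", "arguments": '{"path": "memory/2024-notes.md"}'}
)
# result == "(contents of memory/2024-notes.md)"
```

The model's only job is to pick the right name and arguments; all execution stays in local code, which is why even a 9B model can handle simple agentic tasks reliably.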
The user also tested smaller Qwen variants (0.8B and 2B) on an iPhone 17 Pro via the PocketPal AI app, showing that LLM inference can run completely offline on a mobile device.

Final thoughts

The experiment shows that not all agentic tasks require large models running in the cloud. Many simple operations, such as reading files, formatting output, and summarizing notes, can be handled locally, with cost and privacy benefits.