Qwen 3.5 9B: a local LLM agent on M1 Pro MacBook

Qwen 3.5 9B as a local agent: tests on M1 MacBook Pro

A user shared their experience using the Qwen 3.5 9B language model as a personal automation agent, running it locally on a MacBook Pro equipped with an M1 chip and 16 GB of unified memory.

The goal was to assess whether a model of this size could handle real automation tasks without resorting to cloud services.

Setup and results

The configuration was simplified thanks to the use of Ollama, a framework that exposes an OpenAI-compatible API. The user was then able to direct their existing automation system to the local instance of Qwen 3.5 9B without code changes.

The results were encouraging:

Memory recall: The model correctly handled reading structured memory files and extracting relevant context.
Tool calling: Qwen 3.5 9B demonstrated the ability to invoke the appropriate tools for simple agentic tasks.
Complex reasoning: Limitations emerged, which were predictable given the size of the model.

The user also tested smaller versions of Qwen (0.8B and 2B) on an iPhone 17 Pro via the PocketPal AI app, demonstrating the possibility of running LLM inference completely offline on mobile devices.

Final thoughts

The experiment highlights how not all agentic tasks require large models running in the cloud. Many simple operations, such as reading files, formatting output, and summarizing notes, can be handled locally, with benefits in terms of cost and privacy.

🔍 Continue Exploring

Qwen 3.5 9B: a local LLM agent on M1 Pro MacBook

Qwen 3.5 9B as a local agent: tests on M1 MacBook Pro

Setup and results

Final thoughts

💻 Need GPU Cloud Infrastructure?

💬 Comments (0)

🔍 Continue Exploring

Explore LLM On-Premise

Qwen3-code-next test on Mac Studio Ultra: an analysis

Local LLM Agents: GPT-OSS 20B Tested on macOS

Local AI inference: possible even without a GPU

👥 Join 160+ AI explorers