On-premise LLM Agents: An Experiment with GPT-OSS 20B

A Reddit user shared their experience with the Zeroclaw agent, a framework designed to let large language models (LLMs) interact with their environment. In this case, the user ran a GPT-OSS 20B model entirely locally to automate tasks on a macOS system.

The user highlighted the agent's ability to interact with macOS applications, browse web pages, and manipulate local files, all while keeping data on the machine. Configuring the agent took several hours, mostly to ensure that only safe tools were exposed to the model.
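The post does not show the actual configuration, but the idea of exposing only safe tools usually comes down to a deny-by-default allowlist that every tool call passes through. A minimal sketch in Python (all tool names and the `dispatch` function are hypothetical, not Zeroclaw's real API):

```python
# Hypothetical deny-by-default tool gate for a local agent.
# Tool names below are illustrative, not Zeroclaw's actual tool set.

ALLOWED_TOOLS = {"read_file", "list_dir", "fetch_url"}    # read-only, low risk
# Anything with side effects (shell, writes, network sends) is simply
# left off the allowlist and therefore refused.

def dispatch(tool_name: str, args: dict) -> str:
    """Gate every tool call through the allowlist before executing it."""
    if tool_name not in ALLOWED_TOOLS:
        # Deny by default: a tool not explicitly allowed is never run.
        return f"DENIED: tool '{tool_name}' is not on the allowlist"
    # ... dispatch to the real tool implementation here ...
    return f"OK: {tool_name}({args})"
```

The deny-by-default shape matters: new or unexpected tool names are refused automatically, rather than requiring the user to enumerate everything dangerous.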

Limitations and Challenges

Despite the initial success, the user noted that the GPT-OSS 20B model has limitations. In particular, it tends to lose focus after 15-20 steps and must be explicitly instructed to use its persistent memory. It also behaves erratically when a tool call is denied or when a tool returns an error.
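The two reported weaknesses, drift after 15-20 steps and memory that is only used when prompted, suggest an obvious mitigation: cap the number of turns and re-inject accumulated notes into the context on every step, so the model never has to decide to consult memory. A sketch under those assumptions (the `llm_step` callable and loop shape are hypothetical, not Zeroclaw's implementation):

```python
# Illustrative mitigation for the drift reported after 15-20 steps:
# hard-cap the loop and prepend saved notes to every prompt, forcing
# the model to see its "persistent memory" each turn. Names are hypothetical.

MAX_STEPS = 15  # stop before the range where the model reportedly loses focus

def run_agent(task: str, llm_step, memory: list[str]) -> list[str]:
    """Run at most MAX_STEPS turns; llm_step(context) returns (action, note)."""
    for _ in range(MAX_STEPS):
        # Re-inject the task and all notes so far, every single turn.
        context = f"Task: {task}\nNotes so far: {'; '.join(memory)}"
        action, note = llm_step(context)
        memory.append(note)  # force a memory write on every turn
        if action == "done":
            break
    return memory
```

Forcing the memory write each turn trades a few extra context tokens for not depending on the model remembering to save state on its own.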

For those evaluating on-premises deployments, the central trade-off is between control over data and the computational resources required to run models locally. AI-RADAR offers analytical frameworks on /llm-onpremise to evaluate these aspects.