A user shared their experience with the Qwen3.5-35B and Qwen3.5-27B language models, focusing on token-usage efficiency and how responsive the models feel in their particular use case.
Hardware and Software Configuration
The user runs the models on an RTX 5090 using llama.cpp (llama-server) at release b8269. The models serve primarily as "chat apps", with access to tools such as web search, image manipulation, and querying information on the home server.
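The post does not include the exact launch command, but a setup like this is typically started with llama-server along the following lines. This is a minimal sketch, assuming a GGUF model file and common flags; the model path, context size, and port are placeholders, not the user's actual configuration.

```shell
# Hypothetical llama-server launch (all values are assumptions):
#   -m     path to the GGUF model file
#   -ngl   number of layers to offload to the GPU (99 = all, fits the RTX 5090)
#   -c     context size in tokens
#   --port HTTP port for the OpenAI-compatible API
./llama-server -m ./models/qwen.gguf -ngl 99 -c 32768 --port 8080
```

Once running, llama-server exposes an OpenAI-compatible HTTP endpoint that chat front-ends and tool-calling clients can connect to.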
Parameters and System Prompt
A notable aspect is the reliance on default parameters for the Qwen3.5-35B-A3B and Qwen3.5-27B models: the user sets nothing beyond the defaults. The system prompt is basic but effective, defining the model as a capable and precise assistant, trained by Qwen AI.
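The "defaults only" approach can be sketched as a chat-completions request against llama-server's OpenAI-compatible endpoint. The system-prompt wording below is an assumption paraphrased from the description, and the user message is illustrative; the key point is that no sampling parameters (temperature, top_p, and so on) are sent, so the server's defaults apply.

```python
import json

# Illustrative request body for llama-server's OpenAI-compatible
# /v1/chat/completions endpoint. The system prompt is an assumption
# based on the description above, not the user's exact text.
payload = {
    "messages": [
        {
            "role": "system",
            "content": "You are a capable and precise assistant, trained by Qwen AI.",
        },
        {"role": "user", "content": "Summarize today's calendar."},
    ],
    # Deliberately no temperature/top_p/etc.: the server defaults are used.
}
print(json.dumps(payload, indent=2))
```

Omitting sampling parameters keeps the client minimal and lets the model ship with whatever defaults the release was tuned for, which matches the minimalist configuration the user describes.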
Efficiency and Data Sovereignty
The user emphasizes that they have not encountered "overthinking" issues with these models, contrary to what others have reported. The positive experience is attributed, in part, to the hardware and software configuration, but also to the minimalist approach in defining the parameters. The user runs the models locally, ensuring data sovereignty and complete control over the infrastructure.