A user shared their experience with the Qwen3.5-35B and Qwen3.5-27B language models, focusing on token-usage efficiency and how responsive the models feel in their particular use case.
Hardware and Software Configuration
The user runs the models on an RTX 5090 using llama.cpp (llama-server) at release b8269. The models serve primarily as "chat apps", with access to tools such as web search, image manipulation, and querying information on the home server.
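The post does not include the exact launch command, but a setup like this is typically started with llama-server along the following lines. This is a minimal sketch, assuming a GGUF model file and common flags; the model path, context size, and port are placeholders, not the user's actual configuration.

```shell
# Hypothetical llama-server launch (all values are assumptions):
#   -m     path to the GGUF model file
#   -ngl   number of layers to offload to the GPU (99 = all, fits the RTX 5090)
#   -c     context size in tokens
#   --port HTTP port for the OpenAI-compatible API
./llama-server -m ./models/qwen.gguf -ngl 99 -c 32768 --port 8080
```

Once running, llama-server exposes an OpenAI-compatible HTTP endpoint that chat front-ends and tool-calling clients can connect to.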
Parameters and System Prompt
A notable aspect is the reliance on default parameters for the Qwen3.5-35B-A3B and Qwen3.5-27B models: the user sets nothing beyond the defaults. The system prompt is basic but effective, defining the model as a capable and precise assistant, trained by Qwen AI.
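The "defaults only" approach can be sketched as a chat-completions request against llama-server's OpenAI-compatible endpoint. The system-prompt wording below is an assumption paraphrased from the description, and the user message is illustrative; the key point is that no sampling parameters (temperature, top_p, and so on) are sent, so the server's defaults apply.

```python
import json

# Illustrative request body for llama-server's OpenAI-compatible
# /v1/chat/completions endpoint. The system prompt is an assumption
# based on the description above, not the user's exact text.
payload = {
    "messages": [
        {
            "role": "system",
            "content": "You are a capable and precise assistant, trained by Qwen AI.",
        },
        {"role": "user", "content": "Summarize today's calendar."},
    ],
    # Deliberately no temperature/top_p/etc.: the server defaults are used.
}
print(json.dumps(payload, indent=2))
```

Omitting sampling parameters keeps the client minimal and lets the model ship with whatever defaults the release was tuned for, which matches the minimalist configuration the user describes.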
Efficiency and Data Sovereignty
The user emphasizes that they have not encountered "overthinking" issues with these models, contrary to what others have reported. The positive experience is attributed, in part, to the hardware and software configuration, but also to the minimalist approach in defining the parameters. The user runs the models locally, ensuring data sovereignty and complete control over the infrastructure.