A user recently shared their impressions of the Qwen 3.5 397B language model, highlighting its performance in various tests.

Efficiency and cost

The most interesting aspect seems to be its ability to deliver valid results without an especially elaborate reasoning process. According to the user, this translates into a low inference cost, estimated at around $1. By contrast, some newer models rely on more in-depth reasoning, which can roughly double inference costs.
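The cost gap comes from how reasoning is billed: chain-of-thought tokens count as output tokens, so a model that reasons at length pays for every extra token it generates. A minimal sketch of that arithmetic, with placeholder prices and token counts (not actual Qwen pricing):

```python
# Hypothetical illustration of how reasoning tokens inflate inference cost.
# Prices and token counts below are placeholder assumptions, not real figures.

def inference_cost(input_tokens, output_tokens, price_in_per_m, price_out_per_m):
    """Cost in dollars, given token counts and per-million-token prices."""
    return (input_tokens / 1e6) * price_in_per_m + (output_tokens / 1e6) * price_out_per_m

# A concise answer: 2k input tokens, 1k output tokens.
concise = inference_cost(2_000, 1_000, price_in_per_m=0.5, price_out_per_m=2.0)

# The same query with a long chain-of-thought: reasoning tokens are billed
# as output, so the output count grows several-fold.
reasoning = inference_cost(2_000, 5_000, price_in_per_m=0.5, price_out_per_m=2.0)

print(f"concise:   ${concise:.4f}")
print(f"reasoning: ${reasoning:.4f}  ({reasoning / concise:.1f}x)")
```

With these assumed numbers, the reasoning-heavy run costs several times the concise one, which is why a model that gets valid answers with little reasoning is cheap to operate.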

For those evaluating on-premise deployments, there are trade-offs between initial (CapEx) and operational (OpEx) costs. AI-RADAR offers analytical frameworks on /llm-onpremise to evaluate these aspects.
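The CapEx/OpEx comparison above usually comes down to a break-even question: how many months of API spend does the hardware purchase offset? A minimal sketch, with all figures as placeholder assumptions (not AI-RADAR data):

```python
# Rough break-even sketch for on-premise vs API inference.
# All dollar figures are illustrative placeholders.

def breakeven_months(capex, onprem_opex_monthly, api_cost_monthly):
    """Months until on-premise hardware pays for itself versus an API.

    Returns None if the monthly API spend never exceeds on-prem running costs,
    i.e. the on-premise option has no break-even point.
    """
    monthly_savings = api_cost_monthly - onprem_opex_monthly
    if monthly_savings <= 0:
        return None
    return capex / monthly_savings

# Example: $40k in hardware, $500/month power and maintenance,
# replacing a $2,500/month API bill.
months = breakeven_months(capex=40_000, onprem_opex_monthly=500, api_cost_monthly=2_500)
print(f"break-even after {months:.0f} months")
```

In this toy scenario the hardware pays for itself in 20 months; a real evaluation would also fold in depreciation, utilization, and staffing, which is where structured frameworks like the one mentioned above come in.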