MiniMax-2.5 is an open-source large language model (LLM) achieving state-of-the-art results in several areas, including code generation, tool use, and office-automation tasks.
Technical details
The full model has 230 billion total parameters, of which 10 billion are active per token, and its unquantized bf16 checkpoint requires 457GB of memory. A notable aspect is its large context window of 200,000 tokens, which allows it to process and generate longer and more complex texts.
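The memory figure follows directly from the parameter count and the bit-width of the weights. A minimal sketch (the helper name is illustrative, not from any library):

```python
def model_memory_gb(n_params: float, bits_per_param: int) -> float:
    """Estimate raw weight memory in gigabytes (1 GB = 1e9 bytes).

    Counts weights only; KV cache and activations add to this at runtime.
    """
    return n_params * bits_per_param / 8 / 1e9

# 230 billion parameters at bf16 (16 bits, i.e. 2 bytes per weight)
print(model_memory_gb(230e9, 16))  # 460.0, close to the 457GB cited
```

The small gap between 460GB and the cited 457GB is plausibly due to how the checkpoint files are packed and rounded; the order of magnitude is what matters for hardware planning.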
Quantization and hardware requirements
To reduce hardware requirements, a 3-bit quantized version is available through Unsloth Dynamic. This GGUF build shrinks the model to just 101GB, roughly a 78% reduction from the bf16 footprint. This optimization makes it possible to run MiniMax-2.5 on systems with more limited memory resources, opening up new possibilities for on-premise deployment.
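The same back-of-the-envelope arithmetic shows why the quantized file is somewhat larger than a uniform 3-bit encoding would suggest. The sketch below is an assumption-laden illustration, not a description of the actual GGUF layout:

```python
def quantized_size_gb(n_params: float, avg_bits: float) -> float:
    """Raw weight size at a given average bit-width (1 GB = 1e9 bytes)."""
    return n_params * avg_bits / 8 / 1e9

# A uniform 3-bit encoding of 230B weights would take about 86GB:
print(round(quantized_size_gb(230e9, 3)))  # 86

# The published 101GB file implies a higher effective average bit-width,
# consistent with "dynamic" quantization schemes that keep sensitive
# layers at higher precision (an assumption about this particular build):
print(round(101e9 * 8 / 230e9, 1))  # ~3.5 bits per weight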
For those evaluating on-premise deployments, there are trade-offs to consider. AI-RADAR offers analytical frameworks at /llm-onpremise to evaluate these aspects.