Samsung is experimenting with REAM (REAP-less), an alternative to Cerebras' REAP (Router-weighted Expert Activation Pruning) method, which shrinks large language models (LLMs) by removing experts from mixture-of-experts (MoE) layers.
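For context, REAP-style compression drops whole experts rather than lowering weight precision. The toy sketch below illustrates the general idea only; all shapes, names, and the saliency heuristic are hypothetical, and this is not Cerebras' actual implementation.

```python
import torch
import torch.nn as nn

def reap_style_prune(experts: nn.ModuleList,
                     router_weights: torch.Tensor,
                     num_keep: int):
    """Toy illustration of MoE expert pruning (not the real REAP code).

    router_weights: (num_tokens, num_experts) softmaxed gate values
    collected over a calibration set. Experts the router rarely routes
    to contribute little to outputs and are dropped entirely.
    """
    saliency = router_weights.mean(dim=0)                # (num_experts,)
    keep = torch.topk(saliency, num_keep).indices.sort().values
    pruned = nn.ModuleList(experts[i] for i in keep.tolist())
    return pruned, keep

# Example: 8 toy experts, keep the 4 the router uses most often.
experts = nn.ModuleList(nn.Linear(16, 16) for _ in range(8))
gates = torch.softmax(torch.randn(1024, 8), dim=-1)      # fake calibration pass
pruned, kept = reap_style_prune(experts, gates, num_keep=4)
print(f"kept experts: {kept.tolist()}")
```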

REAM Details

REAM is proposed as a less invasive technique than REAP, intended to minimize the loss of model capabilities during compression. Several Qwen3 models reduced via REAM have already been released (a loading sketch follows the list):

  • Qwen3-Coder-Next-REAM-60B
  • Qwen3-REAM-180B
  • Qwen3-22B
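Assuming these checkpoints follow the usual Hugging Face conventions, loading one would look like the sketch below. The repo ID is hypothetical (the uploading organization is not confirmed here), so treat the path as a placeholder.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical repo ID: prepend the actual organization once known.
model_id = "Qwen3-Coder-Next-REAM-60B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # keep the checkpoint's native precision
    device_map="auto",    # shard across available GPUs (requires accelerate)
)

prompt = "Write a function that reverses a linked list."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```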

Open Questions

The community is raising several questions:

  • How does REAM compare with simply quantizing the original model to a similar size (e.g., Q3 or Q2)?
  • How well do REAM models hold up under further quantization (see the sketch below)?
  • Can fine-tuning or reinforcement learning (RL) still be performed after REAM is applied?
  • Are linear attention models more sensitive to REAM and quantization than standard attention models?
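One way to probe the quantization-resilience question is to stack 4-bit quantization on top of a REAM checkpoint and compare outputs against the unquantized version. A minimal sketch using bitsandbytes follows; the repo ID is hypothetical, and NF4 is only roughly comparable to the GGUF Q3/Q2 levels mentioned above.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Hypothetical repo ID for a REAM-compressed checkpoint.
model_id = "Qwen3-REAM-180B"

# 4-bit NF4 quantization; similar in spirit, though not identical,
# to Q3/Q2 GGUF quantization.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)

# Smoke test: run the same prompts through the quantized and
# unquantized models and compare answer quality or eval scores.
prompt = "Explain the difference between a mutex and a semaphore."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```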