📁 LLM

The LLM archive monitors model releases, quantization updates, reasoning capabilities, and real-world deployment implications for local and hybrid AI. We focus on what materially changes selection and operations: context windows, latency, memory footprint, licensing, and evaluation evidence across open and commercial families. This section is designed for teams that need dependable model intelligence, not hype cycles. Pair these updates with the LLM pillar and references to hardware constraints and framework integration.

The multimodal model Qwen3.5-397B-A17B has been released as open source. This latest-generation model promises high efficiency and native multimodal capabilities. The news was shared on Reddit, where it drew the attention of the LocalLLaMA community.

2026-02-16 Source

Qwen3.5-397B-A17B, a large language model (LLM) developed by Qwen, has been released. The model is accessible via Hugging Face, opening new possibilities for research and development in the field of generative artificial intelligence. Its open-source nature fosters collaboration and innovation within the community.

2026-02-16 Source

The latest version of the Qwen language model, Qwen 3.5 Plus (397B-A17B), has been released on the Chinese Qwen application. The model weights are expected to be released soon, opening up new possibilities for developers and researchers interested in experimenting with this LLM.

2026-02-16 Source

A user expresses frustration with the prevalence of LLMs focused on code generation, at the expense of creative applications such as writing or understanding context in complex conversations. The user questions the scarcity of models optimized for tasks other than programming.

2026-02-16 Source

Sources indicate that Alibaba will release Qwen 3.5 today, a next-generation open-source large language model (LLM). The model is expected to feature significant innovations in its architecture, opening new possibilities for the artificial intelligence community.

2026-02-16 Source

A new study reveals that assigning demographic-based personas to large language models (LLMs) can introduce biases and degrade performance across various scenarios, with performance drops of up to 26%. The research highlights a critical vulnerability in LLM-based agentic systems.

2026-02-16 Source

A new approach, called abstractive red-teaming, aims to identify queries that violate the behavioral specifications of language models. The goal is to uncover categories of problematic queries before large-scale deployment, using reinforcement learning algorithms and LLMs to synthesize adversarial scenarios.

2026-02-16 Source

According to Andrej Karpathy, the cost to train AI models like GPT-2 is decreasing by 40% annually. Improvements stem from better hardware (H100), optimized software (Flash Attention 3), advanced algorithms (Muon optimizer), and higher quality training data (FineWeb-edu). The article analyzes the key factors contributing to this cost deflation.

2026-02-16 Source

InclusionAI has released Ling-2.5-1T, an open-source language model with 1 trillion parameters (63 billion active). Trained on a corpus of 29 trillion tokens, Ling-2.5-1T aims to balance efficiency and performance, offering advanced reasoning capabilities and compatibility with agent platforms. The model uses a hybrid linear attention architecture and refined alignment strategies.

2026-02-15 Source

A technician optimized the inputs of a GPT-2 XL model to render the Bad Apple music video through its attention maps. Because the model was trained without images, the approach relied on optimizing an embedding tensor; processing 3286 frames took approximately 12 minutes on an RTX 5070 Ti.

2026-02-15 Source

MiniMax-2.5, a new open-source language model, stands out for its coding, tool use, and office automation capabilities. The full version requires 457GB of memory, but a 3-bit quantized version is drastically smaller, making it feasible to run on local infrastructure with more accessible hardware. The model has a 200K-token context window.

2026-02-15 Source
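
For teams sizing hardware around the MiniMax-2.5 item above, a rough back-of-envelope sketch may help. It assumes the 457GB figure corresponds to 16-bit weights, which the item does not state, and it scales the footprint linearly with bits per weight. The function name and constants are illustrative only. Real quantized files usually come out somewhat larger, because some tensors are kept at higher precision and quantization scales add overhead, and the KV cache for a long context needs memory on top of the weights.

```python
def quantized_size_gb(full_size_gb: float, full_bits: float, target_bits: float) -> float:
    """Estimate weight footprint after quantization by linear scaling of bits per weight."""
    return full_size_gb * target_bits / full_bits


FULL_SIZE_GB = 457.0       # full-version figure reported in the item above
ASSUMED_FULL_BITS = 16.0   # assumption: the item does not state the original precision

if __name__ == "__main__":
    # Compare a few common target precisions against the full-size figure.
    for bits in (8.0, 4.0, 3.0):
        estimate = quantized_size_gb(FULL_SIZE_GB, ASSUMED_FULL_BITS, bits)
        print(f"{bits:.0f}-bit weights: roughly {estimate:.0f} GB")
```

Treat the result as a lower bound for planning, not as the size of any published quantized release.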

According to some studies, OpenAI's GPT-5 demonstrates a better understanding of the law than human judges. Whether artificial intelligence is really ready to replace legal professionals remains an open question, with both ethical and practical dimensions.

2026-02-15 Source

A Reddit user shared their experience training a small language model (4 billion parameters) to prove complex mathematical theorems. The discussion focuses on the techniques and resources used to achieve this goal.

2026-02-15 Source

For the first time, the top four models on the OpenRouter leaderboard are all open-weight. This marks a potential turning point for adoption of and trust in open-weight language models, which now offer viable alternatives to proprietary models.

2026-02-15 Source

The JoyAI-LLM-Flash open-source large language model (LLM), developed by jdopensource, is available on Hugging Face. The LocalLLaMA community on Reddit has shared links and images related to the model, paving the way for discussions and potential local uses.

2026-02-15 Source

NVIDIA announced that the Nemotron-3 Super and Ultra models are being pre-trained using FP4 precision, leveraging the high FP4 throughput of NVIDIA GPUs. The models are expected to be released in the first half of 2026. In an interview, NVIDIA was also described as a "company of volunteers," emphasizing a decentralized and self-organizing approach to model development.

2026-02-14 Source

An article calls for rediscovering the experimental approach in LLM development, focusing on unique and unconventional datasets. It suggests moving beyond the current trend toward homogeneous models and standardized virtual assistants to achieve more original and interesting results.

2026-02-14 Source