📁 LLM

The LLM archive monitors model releases, quantization updates, reasoning capabilities, and real-world deployment implications for local and hybrid AI. We focus on what materially changes selection and operations: context windows, latency, memory footprint, licensing, and evaluation evidence across open and commercial families. This section is designed for teams that need dependable model intelligence, not hype cycles. Pair these updates with the LLM pillar and references to hardware constraints and framework integration.

📁 LLM AI generated

AI and Humans Verify Fields Medal Proof for the First Time

An AI system, working alongside human mathematicians, has formally verified the proofs of Fields Medal winner Maryna Viazovska. The verified results cover the solution to the sphere packing problem in 8 and 24 dimensions, demonstrating the potential of AI to assist mathematicians and opening new frontiers in large-scale formalization.

2026-03-02 Source

📁 LLM AI generated

Anthropic’s Claude reports widespread outage

Anthropic's AI chatbot Claude experienced widespread service disruptions on Monday morning, with thousands of users reporting issues accessing the bot. The incident raised questions about the stability of cloud infrastructures supporting large language models.

2026-03-02 Source

📁 LLM AI generated

Jan-Code-4B: a small code-tuned model of Jan-v3

The Jan team has released Jan-Code-4B, a small code-tuned model for coding tasks. Based on Jan-v3-4B-base-instruct, it aims to provide assistance in code development, generation, refactoring, and debugging, while maintaining a lightweight footprint for local execution. It can replace the Haiku model in Claude Code.

2026-03-02 Source

📁 LLM AI generated

PSA: Qwen 3.5 Requires BF16 KV Cache, NOT F16

A warning for those running Qwen 3.5 locally with llama.cpp: the KV cache needs to be manually set to BF16 (bfloat16) instead of the default F16 (float16). Perplexity tests on wikitext-2-raw confirm that official Qwen-team implementations, like vLLM, use BF16, while llama.cpp defaults to F16.
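As a minimal sketch of the setting described above, the helper below builds a `llama-server` command line that requests a BF16 KV cache through llama.cpp's `--cache-type-k` and `--cache-type-v` options. The function name and the model path are hypothetical, and whether `bf16` is accepted depends on your llama.cpp build and backend, so check `llama-server --help` before relying on it.

```python
import shlex


def llama_server_args(model_path: str, cache_type: str = "bf16") -> list[str]:
    """Build a llama-server argument list with an explicit KV cache type.

    `model_path` is a placeholder for illustration; the same cache type is
    applied to both the K and the V cache.
    """
    return [
        "llama-server",
        "--model", model_path,
        "--cache-type-k", cache_type,  # K cache data type
        "--cache-type-v", cache_type,  # V cache data type
    ]


if __name__ == "__main__":
    # Hypothetical GGUF path; replace with your own file.
    args = llama_server_args("models/qwen3.5.gguf")
    print(shlex.join(args))
```

The list form can be passed directly to `subprocess.run` without shell quoting; `shlex.join` is used here only to print a copy-pasteable command.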

2026-03-02 Source

📁 LLM AI generated

Qwen 3.5: new small version available

A new version of the Qwen 3.5 language model has been released. The 'small' version could enable more efficient deployments on hardware with limited resources, opening up new possibilities for on-premise and edge applications.

2026-03-02 Source

📁 LLM AI generated

Task-Lens: Cross-Task Utility Based Speech Dataset Profiling for Low-Resource Indian Languages

A new study introduces Task-Lens, a cross-task survey of 50 Indian speech datasets spanning 26 languages, assessing their suitability for nine Natural Language Processing (NLP) tasks. The research aims to overcome data scarcity by identifying untapped metadata and gaps in existing resources to enhance the development of inclusive speech technologies.

2026-03-02 Source

📁 LLM AI generated

Toward General Semantic Chunking: A Discriminative Framework for Ultra-Long Documents

A new discriminative model based on Qwen3-0.6B addresses the segmentation of ultra-long documents, overcoming the limitations of generative models in terms of speed and support for extended inputs. The model uses a sliding-window approach and vector fusion to improve downstream retrieval efficiency.
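To illustrate only the windowing idea mentioned above, here is a generic sliding-window sketch over a list of sentences or tokens. It is not the paper's implementation: the function name, the window size, and the stride are assumptions chosen for illustration, and the vector fusion step is not shown.

```python
from typing import Iterator, Sequence, Tuple


def sliding_windows(
    items: Sequence, size: int, stride: int
) -> Iterator[Tuple[int, Sequence]]:
    """Yield (start_index, window) pairs that cover `items` in order.

    Consecutive windows overlap when stride < size. The last window may be
    shorter than `size` if the input does not divide evenly.
    """
    if size <= 0 or stride <= 0:
        raise ValueError("size and stride must be positive")
    if not items:
        return
    start = 0
    while True:
        yield start, items[start:start + size]
        if start + size >= len(items):
            break  # this window already reaches the end of the input
        start += stride


if __name__ == "__main__":
    sentences = [f"s{i}" for i in range(10)]
    for start, window in sliding_windows(sentences, size=4, stride=2):
        print(start, window)
```

Keeping the stride smaller than the window size means a boundary near the edge of one window also appears in the interior of a neighboring window, which is the usual reason for overlapping windows in segmentation tasks.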

2026-03-02 Source

📁 LLM AI generated

U-CAN: Utility-Aware Contrastive Attenuation for Efficient Unlearning in Generative Recommendation

A novel framework, U-CAN, addresses privacy concerns in LLM-based generative recommendation systems. U-CAN limits the utility loss caused by machine unlearning by selectively attenuating sensitive parameters in low-rank adapters.

2026-03-02 Source

📁 LLM AI generated

REPO: Advanced Defense Against Toxic LLM Outputs via Representation Erasure

A new approach, called REPO (Representation Erasure-based Preference Optimization), aims to reduce the generation of toxic outputs by large language models (LLMs). REPO intervenes on the model's internal representations, forcing toxic representations to converge toward benign ones, and is reported to be more robust than traditional methods.

2026-03-02 Source

📁 LLM AI generated

Agentic LLM Framework for Adverse Media Screening in AML Compliance

A new system based on LLMs and RAG automates adverse media screening, a critical component of AML and KYC processes. The LLM agent searches, processes documents, and calculates a risk index, demonstrating the ability to distinguish between high-risk and low-risk individuals.

2026-03-02 Source

📁 LLM AI generated

The human part of learning is important

AI can speed up progress, but is reaching the destination without the journey worth it? Reflections on the importance of human experience in the age of automation.

2026-03-01 Source

📁 LLM AI generated

Qwen3.5 Small Dense model release seems imminent?

Rumors on Reddit suggest that the release of Qwen3.5 Small Dense is imminent. The open-source community is eager to evaluate the performance and potential applications of this model.

2026-03-01 Source

📁 LLM AI generated

LocalLLaMA: Growing anticipation for new features

A Reddit post sparks interest in the LocalLLaMA community, with speculation about the arrival of new features. The discussion highlights the growing interest in locally run LLM solutions.

2026-03-01 Source

📁 LLM AI generated

Qwen 3.5 27B: Best Chinese Translation Model Under 70B

A LocalLLaMA user reports that Qwen 3.5 27B offers Chinese translations comparable to GPT-3.5 and Gemini, outperforming other models up to 70B. The model was tested on a local setup with 24GB of VRAM, highlighting excellent tone and consistency.
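For context on the 24GB figure, a rough weights-only estimate shows which precisions could plausibly fit. This is back-of-the-envelope arithmetic, not a measurement: the post does not state which quantization was used, the bit widths below are assumptions, and the estimate ignores the KV cache, activations, and runtime overhead.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights in GB (1 GB = 10^9 bytes).

    params_billion * 1e9 weights * (bits / 8) bytes each, divided by 1e9,
    simplifies to params_billion * bits_per_weight / 8.
    """
    return params_billion * bits_per_weight / 8


if __name__ == "__main__":
    vram_gb = 24.0
    # Assumed bit widths for illustration: 16-bit, 8-bit, and 4-bit weights.
    for bits in (16, 8, 4):
        size = weight_memory_gb(27, bits)
        verdict = "fits" if size < vram_gb else "does not fit"
        print(f"27B at {bits}-bit: {size:.1f} GB of weights ({verdict} in {vram_gb:.0f} GB)")
```

Under these assumptions, only the roughly 4-bit case leaves headroom on a 24GB card, and that headroom still has to hold the KV cache and any runtime buffers.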

2026-03-01 Source

📁 LLM AI generated

Google: Longer Reasoning Chains Don't Imply Higher Accuracy in LLMs

New research from Google challenges the assumption that longer reasoning chains lead to better results in language models. The study introduces the concept of Deep Thinking Ratio (DTR) to measure reasoning quality, demonstrating that accurate token selection can reduce computational load while maintaining or improving accuracy.

2026-02-28 Source

📁 LLM AI generated

DeepSeek V4: Image and Video Generation Capabilities Coming Next Week

According to the Financial Times, DeepSeek is preparing to release version 4 of its artificial intelligence model. The new version will include advanced image and video generation capabilities, positioning itself as a direct competitor to models developed in the United States.

2026-02-28 Source

📁 LLM AI generated

Qwen 3.5-35B-A3B: a surprising model for development tasks

A Reddit user reports exceptional results with Qwen 3.5-35B-A3B, a model that has replaced GPT-OSS-120B in their daily workflow. The user employs it for development tasks, process automation, and code analysis, and highlights its ability to compensate for gaps in its knowledge by using browser access.

2026-02-28 Source

📁 LLM AI generated

LocalLLaMA: Community Challenges Vendor Lock-in in AI

A Reddit user praises the LocalLLaMA community for its DIY approach to artificial intelligence, contrasting it with the industry's trend towards proprietary solutions and vendor lock-in. The use of consumer GPUs like the RTX 3090 to develop models locally is seen as a viable alternative and an example of bottom-up innovation.

2026-02-28 Source

📁 LLM AI generated

Monthly update on top-performing open-weight models

A monthly overview of top-performing open-weight models, evaluated based on community discussions and benchmarks. The initiative aims to provide an updated view of open-source alternatives to proprietary models, focusing on their capabilities and limitations.

2026-02-28 Source

📁 LLM AI generated

LocalLLaMA: a look back at the early days of local LLM inference

A Reddit post reminisces about the early days of LocalLLaMA, when running language models locally was a pioneering challenge. The discussion highlights how the open-source community pushed the boundaries of on-premise inference, paving the way for today's solutions. For those evaluating on-premise deployments, there are trade-offs to consider carefully.

2026-02-28 Source